Seminars and Events

ISI Natural Language Seminar

Scale Can't Overcome Pragmatics: The Impact of Reporting Bias on Vision-Language Reasoning

Event Details

Location: Virtual

Speaker: Amita Kamath, University of Washington

Abstract: The lack of reasoning capabilities in Vision-Language Models (VLMs) has remained at the forefront of research discourse. We posit that this behavior stems from a reporting bias in their training data. That is, how people communicate about visual content by default omits tacit information needed to supervise some types of reasoning. We investigate the data underlying popular VLMs through the lens of theories from pragmatics, and find that reporting bias results in insufficient representation of various reasoning skills, even though the corpora are web-scale and/or synthetically generated. My talk will cover the impacts of reporting bias on vision-language model reasoning at various model+data scales, as well as a potential path forward: specifically, focusing on more intentional training data curation methods, rather than counting on scale for emergence of reasoning capabilities.

Zoom Meeting Link
Meeting ID: 970 3602 8903
Passcode: 2025

Speaker Bio

Amita Kamath is a PhD candidate in Computer Science co-advised by Kai-Wei Chang at UCLA and Ranjay Krishna at the University of Washington. She frequently collaborates with the Allen Institute for AI (AI2). Previously, Amita was a Pre-doctoral Young Investigator (PYI) at AI2, where she worked with the PRIOR team on general-purpose vision-language systems. Before that, she completed her MS in Computer Science at Stanford University, where she worked with Percy Liang's group on distribution shift in NLP. She has published papers at top NLP and CV conferences such as EMNLP and CVPR.

Hosts: Jonathan May and Dhananjay Ashok

POC: Maura Covaci