Seminars and Events
Scale Can't Overcome Pragmatics: The Impact of Reporting Bias on Vision-Language Reasoning
Event Details
Location: Virtual
Speaker: Amita Kamath, University of Washington
Abstract: The lack of reasoning capabilities in Vision-Language Models (VLMs) has remained at the forefront of research discourse. We posit that this behavior stems from a reporting bias in their training data: the way people communicate about visual content by default omits tacit information needed to supervise some types of reasoning. We investigate the data underlying popular VLMs through the lens of theories from pragmatics, and find that reporting bias results in insufficient representation of various reasoning skills, even though the corpora are web-scale and/or synthetically generated. My talk will cover the impact of reporting bias on vision-language model reasoning at various model and data scales, as well as a potential path forward: more intentional training data curation, rather than counting on scale for the emergence of reasoning capabilities.
Zoom Meeting Link
Meeting ID: 970 3602 8903
Passcode: 2025