Seminars and Events

ISI Natural Language Seminar

Benchmarking MLLMs for Embodied Decision Making and Cognitive World Modeling

Event Details

Location: ISI #1135

Speaker: Yu (Bryan) Zhou, UCLA

Abstract: While MLLMs have been widely used for spatial cognition and decision making in embodied agent environments, we still lack a systematic understanding of their performance because they are usually applied in different domains, used for different purposes, and built on different inputs and outputs. This presentation will introduce two benchmarks that aim to shed light on this problem by providing systematic definitions and interfaces that support the formalized evaluation of various abilities and input-output specifications of MLLM-based embodied agent modules. Furthermore, we will detail scalable data collection methods and verifiable evaluation methods based on simulation environments.

Zoom Meeting Link
Meeting ID: 919 9598 8425
Passcode: 2025

Speaker Bio

Yu Zhou is a second-year CS Ph.D. student at UCLA, co-advised by Prof. Nanyun Peng and Prof. Kai-Wei Chang. His research focuses on Multimodality (Vision + Language) and Embodied Learning. He is also a part-time research scientist intern in the Segment Anything Team at Meta Superintelligence Labs. Prior to his Ph.D., he was a visiting researcher at Stanford SVL, working with Prof. Jiajun Wu and Prof. Fei-Fei Li. He has organized the Workshop on Foundation Models + Embodied Agents at CVPR 2025 and the Embodied Agent Interface Challenge at NeurIPS 2025. He received the Best Paper Award at the SoCal NLP Symposium 2024.

Subscribe here to learn more about upcoming seminars: https://www.isi.edu/events/

Hosts: Jonathan May and Dhananjay Ashok

POC: Maura Covaci