Seminars and Events

ISI Natural Language Seminar

Insights from Re-evaluating NLP Systems

Event Details

Although large pre-trained models have achieved exceptional results on standard NLP benchmarks, it is clear that they are still far from actually understanding natural language. This gap highlights the need to develop and embrace more challenging evaluation settings. In this talk, I will present work that re-evaluates seemingly high-performing NLP systems and derives insights into how these systems can be further improved. First, we will evaluate models under extreme label imbalance, a phenomenon that creates unavoidable train-test mismatch. Here, collecting training data adaptively leads to dramatic improvements over static data collection. Second, we will grapple with adversarial perturbations: label-preserving transformations that can trigger surprising model errors. We will develop training methods to make models certifiably robust to combinatorially large families of perturbations. Finally, we will assess the utility of automatic evaluation metrics for comparing NLG systems. We will show that metrics can be surprisingly competitive with evaluation schemes that rely on human annotators, and highlight the reduction of statistical bias against particular NLG systems as an important future direction.
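
To make the abstract's notion of a combinatorially large family of label-preserving perturbations concrete, here is a minimal, purely illustrative Python sketch (not the speaker's actual method): the synonym table and toy classifier are hypothetical, and the brute-force enumeration shown here is exactly what certified training aims to avoid.

    import itertools

    # Hypothetical synonym table: each word maps to substitutes assumed to
    # preserve the label. Real work uses curated substitution sets; these
    # entries are made up for illustration.
    SUBSTITUTES = {
        "good": ["great", "fine"],
        "movie": ["film"],
        "really": ["truly", "genuinely"],
    }

    def perturbations(tokens):
        """Yield every sentence in the perturbation family: each word may be
        swapped for any of its substitutes, so the family grows exponentially
        with sentence length."""
        options = [[tok] + SUBSTITUTES.get(tok, []) for tok in tokens]
        for combo in itertools.product(*options):
            yield list(combo)

    def toy_classifier(tokens):
        """Stand-in sentiment classifier: predicts positive iff any word from
        a fixed positive list appears. Purely illustrative."""
        positive = {"good", "great", "fine"}
        return "pos" if positive & set(tokens) else "neg"

    def is_empirically_robust(tokens):
        """Brute-force check that the prediction is constant over the whole
        family. Certified training guarantees this without enumerating the
        exponentially many perturbations."""
        base = toy_classifier(tokens)
        return all(toy_classifier(p) == base for p in perturbations(tokens))

    if __name__ == "__main__":
        sentence = "a really good movie".split()
        print(sum(1 for _ in perturbations(sentence)), "perturbations")
        print("robust:", is_empirically_robust(sentence))

On this four-word example the family already contains 18 variants (3 x 3 x 2 choices); at realistic sentence lengths the family is far too large to enumerate, which is why enumeration-free certification methods matter.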

TALK NOT RECORDED

Speaker Bio

Robin Jia will be an Assistant Professor of Computer Science at the University of Southern California starting in Fall 2021. Currently, he is a visiting researcher at Facebook AI Research, working with Luke Zettlemoyer and Douwe Kiela. He received his Ph.D. in Computer Science from Stanford University, where he was advised by Percy Liang. He is interested broadly in natural language processing and machine learning, with a particular focus on building NLP systems that are robust to distribution shift. Robin's work has received an Outstanding Paper Award at EMNLP 2017 and a Best Short Paper Award at ACL 2018.