ISI Natural Language Seminar

Advances in Text Generation and the Perils of its Automatic Evaluation

Recent advances in large-scale language modeling have significantly improved the capability of natural language generation (NLG) systems, opening up several new applications. Unfortunately, evaluating NLG systems remains challenging, making it hard to measure meaningful progress. In this talk I will present our recent efforts in building and evaluating NLG systems for (1) unsupervised sentence-level style transfer and (2) paragraph-length abstractive question answering with the ELI5 dataset. We build NLG systems (using large language models combined with paraphrase generation and with retrieval, respectively) that significantly outperform the prior state of the art on “standard” automatic metrics. Unfortunately, we discover several issues with the current evaluation setups, including trivial baselines (like input copying) that can game these standard metrics, even outperforming real systems. Along the way I will discuss our efforts towards rectifying these issues, and conclude with a brief mention of other projects working towards more robust NLG evaluation.
(Links to the papers this talk will primarily discuss: https://arxiv.org/abs/2010.05700 and https://arxiv.org/abs/2103.06332)
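
To make the "input copying" concern concrete, here is a minimal, self-contained sketch (not taken from the papers above) of how a baseline that simply echoes its input can earn a non-trivial score under ROUGE-L, a standard lexical-overlap metric for generation tasks like ELI5. All strings and names in the example are invented for illustration.

```python
# Toy demonstration: a degenerate "copy the input" baseline still
# collects ROUGE-L credit, because the metric only measures token
# overlap (longest common subsequence) with the reference text.

def lcs_len(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if a[i] == b[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[m][n]

def rouge_l_f1(prediction, reference):
    """ROUGE-L F1: LCS-based overlap between prediction and reference."""
    p, r = prediction.split(), reference.split()
    lcs = lcs_len(p, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(p), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

# Hypothetical ELI5-style example (made up for this sketch).
question = "why does the sky look blue during the day but red at sunset"
reference_answer = ("the sky looks blue because air molecules scatter "
                    "short blue wavelengths of sunlight more than red "
                    "ones and at sunset the light passes through more "
                    "air so the blue is scattered away and red dominates")
generated_answer = ("air scatters blue light more strongly than red "
                    "light so the sky looks blue and at sunset sunlight "
                    "passes through more atmosphere and appears red")

# The trivial baseline ignores the task and just returns the question.
copy_baseline = question

print(f"real system ROUGE-L F1: {rouge_l_f1(generated_answer, reference_answer):.3f}")
print(f"input-copy ROUGE-L F1:  {rouge_l_f1(copy_baseline, reference_answer):.3f}")
```

Because the question shares many content words with any on-topic answer, the copy baseline scores well above zero despite answering nothing, which is the kind of metric gaming the abstract describes.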