Seminars and Events
Understanding and Improving Learning through Inference with Large Language Models
Event Details
THIS TALK WILL NOT BE RECORDED; IT WILL BE A LIVE BROADCAST ONLY.
REMINDER:
Meeting hosts will only admit guests they know to the Zoom meeting, so you are highly encouraged to sign into Zoom with your USC account.
If you are an outside visitor, please let us know beforehand at nlg-seminar-host(at)isi.edu so we will be aware of your attendance and can let you in.
In-person attendance is limited to USC/ISI faculty, staff, and students. The talk is open to the public virtually via the Zoom registration link.
For more information on the NL Seminar series and upcoming talks, please visit:
https://nlg.isi.edu/nl-seminar/
Language models are capable of “learning at inference” (also referred to as in-context learning) – learning a new task by conditioning on k examples and making a prediction for a new input with no parameter updates. While impressive, models suffer from high variance and low worst-case accuracy. Moreover, we do not understand how or why in-context learning works. In the first part of the talk, I will introduce new methods that lead to significant performance gains by reducing variance and improving worst-case accuracy. I will present a new inference method as well as a new training method, whose combination enables the model to outperform a 230x bigger language model. In the second part of the talk, I will show that in-context learning in fact works very differently from conventional learning: the model does not benefit from correctly paired training data, but rather benefits from a correct specification of the distributions of inputs and labels independently. Finally, I will conclude the talk with lessons learned, limitations, and avenues for future work.
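To make the setup concrete, here is a minimal sketch of what “learning at inference” looks like in practice: the model receives k labeled demonstrations concatenated into its prompt and predicts a label for a new input, with no gradient updates. The sentiment task, labels, and prompt format below are illustrative assumptions, not details from the talk.

```python
# Sketch of in-context (k-shot) learning: the "training data" lives
# entirely in the prompt; no model parameters are updated.

def build_k_shot_prompt(examples, new_input):
    """Concatenate k (input, label) demonstrations, then the new input."""
    blocks = []
    for text, label in examples:
        blocks.append(f"Review: {text}\nSentiment: {label}")
    # The model is asked to continue the pattern for the unlabeled input.
    blocks.append(f"Review: {new_input}\nSentiment:")
    return "\n\n".join(blocks)

demos = [
    ("A delightful, moving film.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
prompt = build_k_shot_prompt(demos, "Surprisingly fun from start to finish.")
print(prompt)
```

This prompt would then be passed to a language model, whose next-token prediction serves as the label; the talk's second part concerns which properties of the demonstrations (pairing vs. input/label distributions) actually drive this behavior.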
Speaker Bio
Sewon Min is a Ph.D. student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, advised by Prof. Luke Zettlemoyer and Prof. Hannaneh Hajishirzi. She is also a part-time visiting researcher at Meta AI. Her research is in the area of natural language processing and machine learning, with a specific focus on question answering, natural language understanding, knowledge representation, and building general-purpose language understanding models. She is a recipient of the 2022 JP Morgan Ph.D. Fellowship. She has co-organized multiple workshops and tutorials at ACL, EMNLP, NeurIPS, and AKBC, including a workshop on Machine Reading for Question Answering, a competition on Efficient Open-domain Question Answering, a workshop on Representation Learning for NLP, a workshop on Semiparametric Methods in NLP, and a tutorial on Zero- and Few-shot Learning with Pretrained Language Models. Prior to UW, she obtained a B.S. degree in Computer Science & Engineering from Seoul National University.
Subscribe here to learn more about upcoming seminars:
https://www.isi.edu/isi-seminar-series/