Seminars and Events

ISI Natural Language Seminar

Modular Language Models

Event Details


Meeting hosts will only admit guests they know to the Zoom meeting, so you are strongly encouraged to sign in to Zoom with your USC account.

If you are an outside visitor, please email us beforehand at nlg-seminar-host(at) so we will be aware of your attendance and can let you in.

In-person attendance is permitted for USC/ISI faculty, staff, and students only. The talk is open to the public virtually via the Zoom link.

For more information on the NL Seminar series and upcoming talks, please visit: 

Abstract

Conventional language models (LMs) are trained densely: all parameters are updated with respect to all data. We argue that dense training leads to a variety of well-documented issues with LMs, including their prohibitive training cost and unreliable downstream behavior. We then introduce a new class of LMs that are fundamentally modular, where components (or experts) of the LM are specialized to distinct domains in the training corpus, and experts are conditionally updated based on the domain of the incoming document. We show how modularity addresses the limitations of dense training by enabling LMs that are rapidly customizable (with the ability to mix, add, or remove experts after training), embarrassingly parallel (requiring no communication between experts), and sparse (needing only a few experts active at a time for inference). Key to our proposal is exploring what constitutes the domains to which experts specialize, as well as reflecting on the data sources used to train LMs. Our new techniques chart a path towards collaborative LM development, where anyone can contribute and maintain experts at very modest computational cost.
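The training scheme the abstract describes (one expert per domain, each updated only on matching documents, with only a few experts active at inference) can be illustrated with a toy sketch. All class and function names below are hypothetical stand-ins for illustration, not the speaker's actual implementation:

```python
# Toy sketch of domain-conditional expert training and sparse routing.
# Hypothetical names throughout; real experts would be full LMs updated
# by gradient steps, and the domain router would be learned or metadata-based.

class ExpertLM:
    """Stand-in for a language model specialized to one domain."""
    def __init__(self, domain):
        self.domain = domain
        self.num_updates = 0

    def update(self, document):
        # A real system would take a gradient step on `document`;
        # here we only count how many documents this expert has seen.
        self.num_updates += 1


def infer_domain(document):
    # Hypothetical router: here we simply read a metadata tag.
    return document["domain"]


class ModularLM:
    def __init__(self):
        # Experts are created lazily, so new domains can be added after
        # training, and any expert can be removed without touching the rest.
        self.experts = {}

    def train_step(self, document):
        domain = infer_domain(document)
        if domain not in self.experts:
            self.experts[domain] = ExpertLM(domain)
        # Conditional update: only the matching expert sees this document,
        # so experts train in parallel with no communication between them.
        self.experts[domain].update(document)

    def active_experts(self, document, k=1):
        # Sparse inference: activate at most k experts for a given input.
        domain = infer_domain(document)
        return [self.experts[domain]][:k]


corpus = [
    {"domain": "news", "text": "..."},
    {"domain": "code", "text": "..."},
    {"domain": "news", "text": "..."},
]
lm = ModularLM()
for doc in corpus:
    lm.train_step(doc)
```

After this loop, the "news" expert has been updated twice and the "code" expert once, and inference on a document touches only its routed expert, mirroring the customizability, parallelism, and sparsity claims above.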

Speaker Bio

Suchin Gururangan is a third-year PhD candidate at the University of Washington, advised by Noah A. Smith and Luke Zettlemoyer. He was previously a visiting researcher at Meta AI, a pre-doctoral resident at the Allen Institute for AI, and spent several years in industry as a data scientist. His research interests span many areas of NLP; he currently works on modular, sparse language models that are efficient to customize and scale. His work has received awards at ACL 2020 and ACL 2021, and he is supported by the Bloomberg Data Science PhD Fellowship.

If the speaker approves being recorded for this NL Seminar talk, it will be posted on our USC/ISI YouTube page within 1-2 business days:

Subscribe here to learn more about upcoming seminars: