Seminars and Events
Inductive Biases for Data- and Parameter-Efficient Transfer Learning
Event Details
Location: CR#689 Conference Room ISI-MDR
Speaker: Mozhdeh Gheini, USC/ISI
THIS TALK WILL NOT BE RECORDED. PLEASE WATCH LIVE OR ATTEND IN PERSON.
REMINDER:
Meeting hosts will only admit online guests they know to the Zoom meeting, so you are strongly encouraged to sign in to Zoom with your USC account.
If you are an outside visitor, please let us know at nlg-seminar-host(at)isi.edu so we can admit you. At least one business day before the event, specify whether you will attend remotely or in person and provide your full name, job title, and professional affiliation. Please arrive at least 10 minutes before the seminar begins.
If you do not have access to the 6th floor for in-person attendance, please check in at the 10th floor main reception desk to register as a visitor, and someone will escort you to the conference room.
https://usc.zoom.us/j/95381979100?pwd=yKkC6snFuqRddSnRCEwnVWvtP9ZdCX.1
Meeting ID: 953 8197 9100
Passcode: 911377
Data- and resource-intensive pre-training and fine-tuning of Transformer-based models is the dominant paradigm at the forefront of rapid advances in natural language processing, human language technologies, and, most notably, large language models. Such reliance on massive amounts of data, computation, and energy, while effective and impressive from a performance-only perspective, can hinder open, nonexclusive, and sustainable development of these technologies. In this talk, we present how certain inductive biases can be devised to adapt current natural language processing methods to resource-constrained scenarios and provide insights into why the proposed inductive biases are successful in such cases.
Specifically, we discuss four research directions on data and parameter efficiency of fine-tuning and transfer learning in natural language processing: (1) a universal regimen that creates a single pre-trained checkpoint suitable for machine translation transfer to practically any language pair and eliminates the need for ad hoc pre-training; (2) an architecture-guided parameter-efficient fine-tuning method that performs competitively with full fine-tuning while exclusively updating cross-attention parameters; (3) an analysis of Mega, a recently introduced augmentation of the Transformer architecture to incorporate explicit recency bias, through the lens of transfer learning; and (4) a meta-learning algorithm to prime pre-trained models for specific fine-tuning strategies.
Together with ablations that show why they are effective and analyses that demonstrate their generalizability, these directions are meant to serve as tools for resource-efficient transfer learning in natural language processing.
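As a rough illustration of direction (2), the sketch below freezes every parameter of a pre-trained translation model except the decoder cross-attention weights before fine-tuning. This is only a minimal sketch, not the exact method from the talk: it assumes the Hugging Face Transformers library, a Marian-style encoder-decoder checkpoint, and the fact that Marian names its cross-attention modules "encoder_attn".

```python
# Minimal sketch: fine-tune ONLY the cross-attention parameters of a
# pre-trained translation model, freezing everything else.
# Assumes the Hugging Face "transformers" library and a Marian-style
# encoder-decoder whose cross-attention modules are named "encoder_attn".
from transformers import MarianMTModel

model = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-de")

trainable = 0
for name, param in model.named_parameters():
    # Keep only decoder cross-attention weights trainable.
    param.requires_grad = "encoder_attn" in name
    if param.requires_grad:
        trainable += param.numel()

total = sum(p.numel() for p in model.parameters())
print(f"Updating {trainable:,} of {total:,} parameters "
      f"({100 * trainable / total:.1f}%)")

# The partially frozen model can then be fine-tuned as usual, e.g. with
# torch.optim.Adam(p for p in model.parameters() if p.requires_grad).
```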
Speaker Bio
Mozhdeh "Mo" Gheini is a PhD candidate at the University of Southern California advised by Jonathan May. Her PhD focus has been on investigating different inductive biases to build data- and parameter-efficient methods for transfer learning for natural language processing tasks like machine translation and beyond. She has also spent three summers interning with Apple, where she will be joining again in February as Machine Learning Research Engineer.
Subscribe here to learn more about upcoming seminars: https://www.isi.edu/events/
For more information on the NL Seminar series and upcoming talks, please visit: