Text generation systems, especially those powered by massive pretrained language models (LMs) like GPT-3, are increasingly used in real applications. However, standard supervised training or finetuning relies on clean task-specific training data. This approach is not applicable to many emerging problems where we have access only to noisy or weakly supervised data, biased data with spurious correlations, or no data at all. Such problems include generating text prompts for steering pretrained LMs, generating adversarial attacks, and various controllable generation tasks. In this talk, I will introduce new principled modeling and learning frameworks for text generation when no (good) data is available. First, I will present a new reinforcement learning (RL) formulation for training with arbitrary reward functions. Building upon the latest advances in soft Q-learning, the approach alleviates the longstanding fundamental challenges of sparse rewards and large action spaces, and achieves strong results on various problems. Next, I will introduce a causal framework for controllable generation that offers a new lens for text modeling from a principled causal perspective. It allows us to eliminate generation biases inherited from training data using rich causality tools. Finally, I will present a unified evaluation framework that produces automatic metrics for evaluating diverse text generation tasks without the need for human references. These uniformly designed metrics consistently and substantially improve correlation with human judgments across tasks.
Zhiting Hu is an Assistant Professor in the Halicioglu Data Science Institute at UC San Diego. He received his Bachelor's degree in Computer Science from Peking University in 2014, and his Ph.D. in Machine Learning from Carnegie Mellon University in 2020. His research interests lie in the broad areas of machine learning, artificial intelligence, natural language processing, and ML systems. In particular, he is interested in principles, methodologies, and systems for training AI agents with all types of experiences (data, symbolic knowledge, rewards, adversaries, lifelong interplay, etc.), and in their applications in controllable text generation, healthcare, and other domains. His research was recognized with a best demo nomination at ACL 2019 and an outstanding paper award at ACL 2016.
Host: Muhao Chen, POC: Amy Feng
YOU ONLY NEED TO REGISTER ONCE TO ATTEND THE ENTIRE SERIES – We will send you email announcements with details about upcoming speakers.
Register in advance for this webinar: https://usc.zoom.us/webinar/register/WN__0VhakI6Q6i3JsasdmNWcA.
After registering, you will receive an email confirmation containing information about joining the Zoom webinar.
The recording for this AI Seminar talk will be posted on our USC/ISI YouTube page within 1-2 business days: https://www.youtube.com/user/USCISI.