Seminars and Events

ISI Natural Language Seminar

Fantastic Continuous-valued Sentence Representations and How to Find Them

Event Details

Recently, large pretrained language models have seen tremendous success in few-shot and zero-shot learning settings when given appropriate prompting. For these models to excel in this few-shot paradigm, the provided prompts must condition the language model to produce the desired output sequence. We investigate a more general property of this idea: does there exist a continuous, fixed-length vector that can prompt a pretrained and frozen language model to generate any output sentence of interest? To answer this, we develop a language-model-agnostic sentence representation discovery algorithm, which learns a continuous-valued, fixed-length vector for a sequence by adding the vector at various locations in the language model and optimizing it to maximize the likelihood of the sequence. Experiments reveal that for nearly all English sentences (> 98%) from different genres and corpora, we can find a vector that recovers our sentence of interest exactly, without fine-tuning the underlying language model. In addition, we find that our representations can be learned in real time and are robust to initialization; convergence requires fewer than 20 iterations on average using stochastic gradient descent with Adam.

This talk will mostly cover: https://arxiv.org/abs/2008.09049

Reminder: Meeting hosts will only admit guests they know to the Zoom meeting, so you are highly encouraged to sign into Zoom with your USC account. If you are an outside visitor, please inform [email protected] beforehand so we are aware of your attendance and can let you in.

Speaker Bio

Nishant is a predoctoral young investigator on the AllenNLP team at AI2, working with Matt Peters, and part of the Masakhane community. He is broadly interested in sentence representations, ML for social good, and out-of-distribution generalization research. Previously, Nishant spent time in industry working on document analysis, OCR, fake speech detection, and audio-driven facial animation. He has also worked on NLP and causality research with Kyunghyun Cho, Sam Bowman, and Doug Downey during his time at NYU and Northwestern.