Seminars and Events
Fantastic Continuous-valued Sentence Representations and How to Find Them
Event Details
Recently, large pretrained language models have seen tremendous success in few-shot and zero-shot learning settings when given appropriate prompting. For these models to excel in this few-shot paradigm, the provided prompts must condition the language model to produce the desired output sequence. We investigate a more general form of this idea: does there exist a continuous, fixed-length vector that can prompt a pretrained and frozen language model to generate any output sentence of interest? To answer this, we develop a language-model-agnostic sentence representation discovery algorithm, which learns a continuous-valued, fixed-length vector for a sequence by adding the vector at various locations in the language model and optimizing it to maximize the likelihood of the sequence. Experiments reveal that for nearly all English sentences (more than 98%) from different genres and corpora, we can find a vector that recovers our sentence of interest exactly without fine-tuning the underlying language model. In addition, we find that our representations can be learned in real time and are robust to initialization; convergence requires fewer than 20 iterations on average using stochastic gradient descent with Adam.
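To make the core idea concrete, here is a minimal sketch of learning a fixed-length vector against a frozen model. Everything in it is an assumption for illustration and not the speaker's implementation: the "language model" is a tiny random next-token predictor (the hypothetical `EMBED` and `W_OUT` tables), the vector `z` is added at a single hidden location, and plain gradient descent stands in for Adam.

```python
import math
import random

random.seed(0)
VOCAB, DIM = 6, 8

# Frozen toy "language model" (hypothetical): random embeddings and output weights.
EMBED = [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(VOCAB)]
W_OUT = [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(VOCAB)]


def step_probs(prev_token, z):
    """Next-token distribution with z added to the hidden pre-activation."""
    hidden = [math.tanh(e + zi) for e, zi in zip(EMBED[prev_token], z)]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in W_OUT]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return hidden, [e / total for e in exps]


def nll_and_grad(tokens, z):
    """Mean negative log-likelihood of tokens[1:] and its gradient w.r.t. z."""
    nll, grad = 0.0, [0.0] * DIM
    pairs = list(zip(tokens[:-1], tokens[1:]))
    for prev, target in pairs:
        hidden, probs = step_probs(prev, z)
        nll -= math.log(probs[target])
        # d(nll)/d(logits) = probs - one_hot(target)
        dlogits = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
        for i in range(DIM):
            dhidden = sum(dlogits[k] * W_OUT[k][i] for k in range(VOCAB))
            grad[i] += dhidden * (1.0 - hidden[i] ** 2)  # tanh derivative
    n = len(pairs)
    return nll / n, [g / n for g in grad]


def fit_vector(tokens, steps=200, lr=0.1):
    """Optimize only z; EMBED and W_OUT are never updated."""
    z = [0.0] * DIM
    for _ in range(steps):
        _, grad = nll_and_grad(tokens, z)
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z


sentence = [0, 3, 1, 4, 2, 5]  # token ids of a made-up "sentence"
initial_nll, _ = nll_and_grad(sentence, [0.0] * DIM)
z = fit_vector(sentence)
final_nll, _ = nll_and_grad(sentence, z)
print(f"mean NLL before: {initial_nll:.3f}, after: {final_nll:.3f}")
```

The sketch only shows the optimization loop: the model weights stay fixed and the sequence likelihood is raised by changing `z` alone. It makes no claim about exact sentence recovery, which in the abstract refers to real pretrained language models.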
This talk will mostly cover: https://arxiv.org/abs/2008.09049
Reminder: Meeting hosts admit only guests they know to the Zoom meeting, so you are strongly encouraged to sign in to Zoom with your USC account. If you are an outside visitor, please inform [email protected] beforehand so that we are aware of your attendance and can let you in.