Generating Adversarial Examples with Syntactically Controlled Paraphrase Networks

Friday, March 30, 2018, 3:00 pm - 4:00 pm PST
Conf. Rm #1135
This event is open to the public.
NL Seminar
Mohit Iyyer (AI2, UMass Amherst)

Abstract: Many datasets for natural language processing problems lack linguistic variation, which hurts generalization of models trained on them. Recent research has shown that it is possible to break many learned models by evaluating them on adversarial examples, which are generated by manually introducing lexical, pragmatic, and syntactic variation to existing held-out examples from the data. Automating this process is challenging, as input semantics must be preserved in the face of potentially large sentence modifications. In this talk, I will focus specifically on syntactic variation in discussing our recent work on syntactically controlled paraphrase networks (SCPN) for adversarial example generation.

Given a sentence and a target syntactic form (e.g., a constituency parse), an SCPN is trained to produce a paraphrase of the sentence with the desired syntax. We show it is possible to create training data for this task by first doing backtranslation at a very large scale, and then using a parser to label the syntactic transformations that naturally occur during this process. Such data allows us to train a neural encoder-decoder model with extra inputs to specify the target syntax. A combination of automated and human evaluations shows that SCPNs generate paraphrases that almost always follow their target specifications without decreasing paraphrase quality when compared to baseline (uncontrolled) paraphrase systems. Furthermore, they are more capable of generating syntactically adversarial examples that both (1) "fool" pretrained models and (2) improve the robustness of these models to syntactic variation when used for data augmentation.
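
The data-creation step described above can be pictured with a small sketch. It is an illustration under stated assumptions, not the speaker's implementation: it assumes each backtranslated paraphrase already comes with a bracketed constituency parse from some external parser, and it reduces that parse to a coarse "template" by keeping only the top levels of the tree. The function names (`parse_template`, `build_training_triples`) and the depth cutoff of 2 are hypothetical choices made for the example.

```python
from typing import List, Tuple


def tokenize_parse(parse: str) -> List[str]:
    """Split a bracketed parse like "(S (NP (PRP She)) ...)" into tokens."""
    return parse.replace("(", " ( ").replace(")", " ) ").split()


def parse_template(parse: str, max_depth: int = 2) -> str:
    """Keep only the constituent labels in the top `max_depth` tree levels.

    The depth cutoff is an assumption made for this sketch.
    """
    tokens = tokenize_parse(parse)
    kept: List[str] = []
    depth = 0
    for i, tok in enumerate(tokens):
        if tok == "(":
            depth += 1
            if depth <= max_depth:
                kept.append(tok)
        elif tok == ")":
            if depth <= max_depth:
                kept.append(tok)
            depth -= 1
        elif tokens[i - 1] == "(" and depth <= max_depth:
            # A token directly after "(" is a constituent label, not a word.
            kept.append(tok)
    return " ".join(kept).replace("( ", "(").replace(" )", ")")


def build_training_triples(
    pairs: List[Tuple[str, str, str]]
) -> List[Tuple[str, str, str]]:
    """Turn (source, paraphrase, paraphrase_parse) into
    (source, target_template, paraphrase) examples for an encoder-decoder
    that takes the template as an extra input."""
    return [(src, parse_template(tgt_parse), tgt) for src, tgt, tgt_parse in pairs]


# Hypothetical backtranslated pair; the parse would come from a real parser.
example = (
    "She went home by car.",
    "She drove home.",
    "(S (NP (PRP She)) (VP (VBD drove) (ADVP (RB home))) (. .))",
)
triples = build_training_triples([example])
print(triples[0][1])  # (S (NP) (VP) (.))
```

Each resulting triple pairs an input sentence with the syntactic template its paraphrase happened to take, which is the kind of supervision a model needs in order to learn to follow a requested syntax.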

Bio: Mohit Iyyer will be joining UMass Amherst as an assistant professor in Fall 2018. Currently, he is a Young Investigator at the Allen Institute for Artificial Intelligence; prior to that, he received a Ph.D. from the Department of Computer Science at the University of Maryland, College Park, advised by Jordan Boyd-Graber and Hal Daumé III. His research interests lie at the intersection of natural language processing and machine learning. More specifically, he focuses on designing deep neural networks for both traditional NLP tasks (e.g., question answering, language generation) and new problems that involve creative language (e.g., understanding narratives in novels). He has interned at MetaMind and Microsoft Research, and his research has won a best paper award at NAACL 2016 and a best demonstration award at NIPS 2015.
