Seminars and Events

ISI Natural Language Seminar

1.) On Exploiting Context Usage in Document-Level Neural Machine Translation
2.) Improving Moderation of Online Discussions via Nonviolent Communication
3.) Heritage-aware language model adaptation for diasporic languages

Event Details


Meeting hosts will only admit guests they know to the Zoom meeting, so you are highly encouraged to sign into Zoom with your USC account.

If you’re an outside visitor, please contact us beforehand at nlg-seminar-host(at) so we’ll be aware of your attendance and can let you in.

In-person attendance is limited to USC/ISI faculty, staff, and students. The talks are open to the public virtually via the Zoom registration link.

For more information on the NL Seminar series and upcoming talks, please visit: 

1.) Jacqueline He’s Abstract

A crucial limitation of current sentence-level machine translation systems is their inability to account for context. By processing each sentence in isolation, existing neural machine translation (NMT) systems are prone to missing important document-level cues and demonstrate a poor understanding of inter-sentential discourse properties, resulting in a noticeable quality difference between human-translated and machine-translated text. In this talk, we will discuss ongoing efforts to construct NMT models that can effectively harness context. We primarily focus on the popular IWSLT’17 English-to-French translation task, and compare against a strong concatenation-based Transformer (Vaswani et al., 2017) baseline. First, we corroborate existing findings (Fernandes et al. 2021) that increasing context can improve translation performance, though with diminishing returns. We hypothesize that the Transformer’s self-attention mechanism may be insufficient for handling long-range dependencies across sentences, both inside and outside of the context window. We then explore replacing the Transformer with a novel neural architecture whose attention layer is based on an exponential moving average to exploit both local and global contexts. Finally, we will discuss a chunk-based strategy towards encoding and decoding text, and conclude with future directions.
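The exponential-moving-average idea mentioned above can be illustrated in a few lines: each position blends the current token's embedding with an exponentially decaying summary of everything before it, so local and longer-range context are mixed at linear cost in sequence length (in contrast to self-attention's quadratic cost). This is a minimal sketch for intuition, not the architecture from the talk; the function name and the fixed decay `alpha` are illustrative assumptions.

```python
def ema_context(embeddings, alpha=0.5):
    """Damped exponential moving average over a token sequence.

    embeddings: list of equal-length vectors, one per token.
    Each output position blends the current token with a decaying
    summary of all earlier tokens, so every position sees (soft)
    long-range context while the whole pass stays O(seq_len).
    """
    state = [0.0] * len(embeddings[0])
    history = []
    for vec in embeddings:
        # New state = alpha * current token + (1 - alpha) * old state.
        state = [alpha * v + (1.0 - alpha) * s for v, s in zip(vec, state)]
        history.append(state)
    return history

# Four identical tokens: the running state converges toward the input.
history = ema_context([[1.0, 1.0]] * 4, alpha=0.5)
print([round(h[0], 4) for h in history])  # [0.5, 0.75, 0.875, 0.9375]
```

A smaller `alpha` retains more of the distant past (more global context); a larger `alpha` weights recent tokens (more local context).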

2.) Taiwei Shi’s Abstract

The growing volume of comments makes online discussions difficult to moderate with human moderators alone. A crucial limitation of current automated moderation is that its generated responses are repetitive, generic, and judgmental, which is ineffective at changing someone’s mind or behavior. We seek to build a moderation system that can intervene in an adversarial conversation whose participants have abandoned reasoned discussion and descended into personal attacks. While this is a difficult problem even among humans, we would like to explore the effectiveness of Nonviolent Communication (NVC), an approach to repairing breakdowns in communication. In this talk, we will discuss strategies for incorporating one aspect of NVC, observation without evaluation (O-vs-E), into dialogue models. First, we obtain a sufficiently large set of O-vs-E dialogue data to train an O-vs-E classifier, and then expand it into a set large enough to fine-tune a dialogue model. We also explore text style transfer to rewrite moderation datasets, so the model can actively intervene in toxic conversations while remaining observational. In addition, we investigate prompt engineering for GPT-3, which yields surprisingly better performance than our fine-tuned dialogue model. Finally, we will discuss strategies for evaluating model performance and conclude with future directions.
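To make the observation-vs-evaluation (O-vs-E) distinction concrete: an observation states what happened ("You spoke for twenty minutes"), while an evaluation passes judgment ("You always dominate meetings"). The talk's actual approach trains a classifier on labeled dialogue data; the tiny rule-based stand-in below, including its marker list, is purely a hypothetical illustration of the distinction.

```python
# Hypothetical stand-in for an O-vs-E classifier: flag a sentence
# as an evaluation if it contains judgmental or absolutist markers.
# A real system would fine-tune a pretrained classifier instead.
EVALUATIVE_MARKERS = {"always", "never", "lazy", "stupid", "terrible", "dismissive"}

def classify_o_vs_e(sentence: str) -> str:
    """Return 'E' (evaluation) if any marker word appears, else 'O'."""
    tokens = {word.strip(".,!?").lower() for word in sentence.split()}
    return "E" if tokens & EVALUATIVE_MARKERS else "O"

print(classify_o_vs_e("You always dominate meetings."))        # E
print(classify_o_vs_e("You spoke for twenty minutes today."))  # O
```

A moderation model trained toward the "O" style would intervene with factual restatements rather than judgments, which NVC suggests is less likely to escalate conflict.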

3.) Jonne Saleva’s Abstract

Multilingual language models (MLLMs) have proven their effectiveness as cross-lingual representation learners that perform well on a variety of languages, including many lower-resourced and zero-shot ones.

Although effective, MLLMs remain somewhat opaque and the nature of their cross-linguistic transfer is difficult to understand. While it seems plausible that higher- and lower-resourced languages should share information within a model, what is less clear is how such transfer is mediated by linguistic relatedness.

In this talk, we investigate this problem through the lens of diasporic languages which can be (crudely) understood as a combination of two ancestral languages, a “co-cultural language” and a “co-territorial language”.

Specifically, we ask whether using data in these languages (or some mixture of them) during MLLM adaptation can improve performance on lower-resourced diasporic languages, both in terms of perplexity and an extrinsic named entity recognition task.

We outline results on Yiddish (a Germanic language spoken by Ashkenazi Jews) and Ladino (a Romance language spoken by Sephardic Jews) and discuss the effectiveness of using German, Spanish and Hebrew as ancestral languages.

Finally, we combine ancestral-language-based adaptation with the recent lexicon-based approach proposed by Wang et al. (2022) and conclude with directions for future work.
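Perplexity, the intrinsic metric mentioned above, is the exponential of the negative mean per-token log-likelihood that a model assigns to held-out text; lower is better, so successful adaptation on ancestral-language data should lower it on diasporic-language text. A minimal sketch, assuming per-token natural-log probabilities are available from the model:

```python
import math

def perplexity(log_probs):
    """Perplexity = exp(-mean per-token log-likelihood).

    log_probs: natural-log probabilities the model assigned to each
    token of a held-out text. Lower perplexity means the model was
    less "surprised" by the text.
    """
    return math.exp(-sum(log_probs) / len(log_probs))

# A model assigning every token probability 0.25 has perplexity 4:
# it is as uncertain as a uniform choice among four tokens.
print(perplexity([math.log(0.25)] * 10))  # ≈ 4.0
```

Comparing this number before and after adapting on, say, German or Hebrew data gives a task-independent measure of how much each ancestral language helps.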

Speaker Bios

1.) Jacqueline He's Bio

Jacqueline He is a summer intern with the Natural Language Group at USC ISI under Professors Jonathan May and Xuezhe Ma. She recently graduated from Princeton University with a bachelor’s degree in Computer Science. Her current research centers on context-aware neural machine translation, and she has previously worked on interpretability and ethics in NLP.

2.) Taiwei Shi's Bio

Taiwei Shi is a summer intern with the Natural Language Group at USC ISI under Professors Jonathan May and Xuezhe Ma. He is also an undergraduate student at the Georgia Institute of Technology, majoring in Computer Science and Mathematics. He has previously worked in Georgia Tech’s SALT lab under Professor Diyi Yang. He is working towards a career where he can pursue his interests and make an impact in natural language processing, especially in the fields of computational social science and philosophy.

3.) Jonne Saleva's Bio

Jonne Sälevä is a summer intern in the Natural Language Group at USC ISI, working on language modeling for lower-resourced diasporic languages under Prof. Jonathan May.

Jonne is also a Ph.D. student in Computer Science at Brandeis University, where he is working on NLP for morphologically rich and lower-resourced languages as part of the Broadening Linguistic Technologies Lab led by Prof. Constantine Lignos.

Prior to his doctoral studies, Jonne received an M.S. in Computer Science from Brandeis University in 2021 and an A.B. in Statistics from Harvard College in 2017.

The recording for this NL Seminar talk will be posted on our USC/ISI YouTube page within 1-2 business days:

Subscribe here to learn more about upcoming seminars: