November 30, 2023
Machine Learning with Human Fault-Tolerance
Abstract: In machine learning, we have long recognized the need to build systems that can tolerate hardware faults and software faults. In this talk, I propose the need for a third kind of fault-tolerance: human fault-tolerance. The methods used to develop, evaluate, and deploy machine learning systems today assume that the humans who build and use them are rational actors making highly informed decisions based on consistent preferences—this is far from true in practice. We can address the failures of these assumptions by drawing from economics, a field that has long been aware of how unfounded beliefs about human behavior can go wrong. Specifically, I will cover how we can develop theoretically grounded tools that discover human mistakes, design algorithms and methods for robustly eliciting and incorporating human feedback, and implement end-to-end platforms that make ML and NLP more transparent and reproducible. This line of work has led to the creation of datasets, models, and platforms that have been widely adopted by industry giants like Amazon, Google, and Meta.
Bio: Kawin Ethayarajh is a 5th year PhD student at Stanford University, where he works on bringing human fault-tolerance to machine learning. His research draws from economics to make machine learning and NLP more robust to the irrational, inconsistent, and uninformed human decisions made at every step. His work has been supported by a Facebook Fellowship and an NSERC PGS-D, and he has received an Outstanding Paper Award at ICML 2022. He co-created the Stanford Human Preferences dataset and the Dynaboard platform (behind Dynabench).
Talk Details: https://www.isi.edu/events/4157/machine-learning-with-human-fault-tolerance/
November 16, 2023
Cultural Knowledge and Cultural Biases: Analyzing the Multilingual Performance of Text-to-Image Models
Abstract: Despite being ostensibly trained solely on English data, most text-to-image (T2I) models carry some degree of multilingual capability, with significant variation in performance between models and languages. To guide the future development of T2I systems, it is desirable to both measure and qualitatively analyze these language-specific performance variations, in order to mitigate cross-lingual disparities in performance as well as language-specific demographic biases.
To quantify multilingual performance we introduce the Conceptual Coverage Across Languages (CoCo-CroLa) benchmark, which allows us to measure the “possession” of a set of tangible noun “concepts” across English, Spanish, German, Chinese, Japanese, Hebrew, and Indonesian. This technique allows us to estimate how well-suited a model is to a target language as well as identify model-specific weaknesses, spurious correlations, and biases without any a priori assumptions of their form. We demonstrate how it can be used to rank T2I models in terms of multilinguality, and that despite its simplicity our method captures the necessary conditions for the impressive “creative” generative abilities users expect from T2I models.
We then build on this benchmarking work with a detailed qualitative analysis of “failure” and “success” cases for specific concepts. Even in the “possession” case, concepts are expressed differently across languages. These qualitative cross-lingual variations in model behaviors form a continuous spectrum of ethical acceptability, running the gamut from culturally variable popular dog breeds to racially-biased sexualization in depictions of women. While the edge cases are easy to laud or condemn, drawing the line of acceptability in between them is an open ethical question as well as an open technical challenge. Unfortunately, interventions that successfully remove the most deleterious biases also erase cultural distinctiveness, motivating a need for more targeted interventions in future work.
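As a rough illustration of the benchmark's aggregation logic, here is a minimal sketch: the function names, threshold, and score inputs are all hypothetical, and the actual benchmark scores images generated per concept and per language with an image-text similarity model rather than taking precomputed numbers.

```python
# Hypothetical sketch of a CoCo-CroLa-style aggregation. In practice,
# each score would come from comparing a generated image against the
# concept via an image-text similarity model; here scores are given.

def concept_possession(scores, threshold=0.5):
    """Fraction of generated images whose similarity score meets an
    assumed threshold: a stand-in for 'possessing' the concept."""
    return sum(s >= threshold for s in scores) / len(scores)

def rank_models(model_scores):
    """Rank models by mean possession across languages and concepts.
    model_scores maps a model name to a list of possession scores."""
    means = {m: sum(v) / len(v) for m, v in model_scores.items()}
    return sorted(means, key=means.get, reverse=True)
```

Under this assumed aggregation, a model's multilinguality ranking falls out of averaging possession over the concept-language grid.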
Bio: Michael Saxon is a CS Ph.D. candidate in the NLP Group at the University of California, Santa Barbara. His research is driven by a desire to improve our objective understanding of the semantic capabilities of large generative AI systems, in particular generative image and language models. Toward this goal he focuses on developing novel data resources and metrics to model semantic phenomena in generative models, as well as techniques for model-driven dataset improvement that remove biases and spurious correlations. He has previously interned at Meta AI and Amazon working on NLP and speech, and is supported by the NSF Graduate Research Fellowship Program.
Talk Details: https://www.isi.edu/events/4137/nl-seminar-cultural-knowledge-and-cultural-biases-analyzing-the-multilingual-performance-of-text-to-image-models/
November 9, 2023
Manipulating Large Language Model Predictions Through Data
Abstract: Large language models use large amounts of unmoderated data at each stage of the training and deployment pipeline. In this talk, I will show how these lax requirements enable adversaries to manipulate both training and test data, allowing a myriad of possible attacks. First, during training time, I will show that adversaries can modify instruction-tuning datasets to systematically manipulate predictions across a range of tasks or induce degenerate outputs across hundreds of arbitrary tasks, using as few as 100 poison examples. At inference time, additional data is often used in retrieval- or tool-augmented models. Naturally, these models will face information from a wide variety of sources that have varying degrees of quality. Humans are faced with this same range of sources but can make judgements of trustworthiness based on factors like the style of argumentation or the recency of information. We show not only that model predictions differ significantly from human credibility judgements, but also that gaps in this judgement create opportunities for adversaries to manipulate answers to user queries.
Bio: Alexander Wan is a third-year undergraduate at UC Berkeley majoring in Computer Science, Statistics, and Mathematics. He works closely with folks at the Berkeley NLP Group and the MSU Heterogeneous Learning and Reasoning lab, with a focus on improving the robustness and interpretability of large language models. He's also more broadly interested in the intersection of machine learning and cognitive science: using current ML models to better understand human cognition and building more robust models through cognitively inspired architectures and training.
Talk Details: https://www.isi.edu/events/4160/nl-seminar-manipulating-large-language-model-predictions-through-data/
November 2, 2023
What We Learned from 570K ChatGPT Interaction Logs In The Wild
Abstract: Chatbots such as GPT-4 and ChatGPT are currently serving millions of users. Despite their widespread use, there remains a lack of public datasets that showcase how these tools are used by users in practice. In this talk, I will introduce (InThe)WildChat, a corpus of 570K user-ChatGPT conversations, which comprises over 1.5 million interaction turns. I will show that, compared to other popular user-chatbot interaction datasets, WildChat offers the most diverse user prompts and presents the richest variety of potentially toxic use-cases. Finally, I will demonstrate the potential utility of this dataset in fine-tuning state-of-the-art instruction following models.
Bio: Wenting Zhao is a Ph.D. candidate in Computer Science at Cornell University. Her research focuses on improving the reasoning capabilities of large language models by exploiting explicit problem structures. She organized an ACL tutorial on complex reasoning over natural language and the second Workshop on Natural Language Reasoning and Structured Explanations. She has done internships at IBM Research, Amazon Alexa, and AI2 Mosaic.
Talk Details: https://www.isi.edu/events/4133/nl-seminar-what-we-learned-from-570k-chatgpt-interaction-logs-in-the-wild/
October 26, 2023
Design Criteria for Human-Centered Natural Language Generation
Abstract: Large language models have made substantial steps towards generating human-like language. However, this endeavor to mimic human language comes with potential drawbacks. By mimicking and appropriating human language, these systems produce language that inherits the harms and cognitive biases of humans while failing to ensure features like clarity and transparency. My research asks: how can generated language avoid the harms of natural language while supporting safe human-AI collaboration?
Starting with the researchers, I study the quality criteria of natural language generation, using mixed-methods approaches to reveal design decisions made consciously and subconsciously by natural language generation practitioners. Looking through datasets of natural language, I identify the origins of language appropriation and illustrate the safety risks of mimicry via the linguistic miscalibration of language models. Lastly, I study how humans perceive the appropriation of social behaviors such as politeness and refusal, and the risks they may pose in chat settings. What I find throughout my research is that language models inappropriately appropriate the style, linguistic cues, and prosocial language of the human text they are trained on. My future work seeks to develop design criteria for generated language, centered on user needs, and training methods to achieve them.
Bio: Kaitlyn Zhou is currently pursuing her PhD in computer science at Stanford University, advised by Dan Jurafsky. Her research focuses on investigating the unintended consequences that stem from the appropriation of natural language by language models. Her work delves into various aspects, including the fairness implications associated with the evaluation of natural language generation, the linguistic miscalibration displayed by language models, and the misplaced overconfidence of publicly deployed chatbots. Kaitlyn has previously spent summers at Microsoft Research and the Allen Institute for Artificial Intelligence. She is funded by the Stanford Graduate Fellowship and her visualization techniques have gained recognition in prominent publications like The New York Times and the Wall Street Journal. In 2018, Kaitlyn was appointed by Washington State Governor Jay Inslee to the University of Washington Board of Regents.
Talk Details: https://www.isi.edu/events/4135/design-criteria-for-human-centered-natural-language-generation/
October 19, 2023
Interactive AI Systems Specialized in Social Influence
Abstract: AI research has so far focused on modeling common human skills, such as building systems to see, read, or talk. As these systems gradually reach human-level performance on standard benchmarks, it is increasingly important to develop next-generation interactive AI systems with more advanced human skills that can function in realistic and critical applications such as providing personalized emotional support. In this talk, I will cover (1) how to build such expert-like AI systems specialized in social influence that can persuade, negotiate, and cooperate with other humans during conversations; (2) how humans perceive such specialized AI systems, a study that validates the necessity of Autobot Law and proposes guidance for regulating them; and (3) our proposed privacy notion, Selective Differential Privacy, and an algorithm for training privacy-preserving models with high utility, since these systems become more prone to leaking users' private information as they grow more powerful. Finally, I will conclude with my long-term vision of building a natural interface between human intelligence and machine intelligence via dialogue, using a multi-angle approach that combines Artificial Intelligence, Human-Computer Interaction, and the social sciences to develop expert AI systems for everyone.
Bio: Weiyan Shi is a postdoc at Stanford NLP and an incoming assistant professor at Northeastern University starting in 2024. Her research interests are in Natural Language Processing (NLP), especially in social influence dialogue systems such as persuasion, negotiation, and recommendation. She has also worked on privacy-preserving NLP applications. She is recognized as a Rising Star in Machine Learning by the University of Maryland. Her work on personalized persuasive dialogue systems was nominated for ACL 2019 best paper. She was also a core team member behind a Science publication on the first negotiation AI agent, Cicero, that achieves a human level in the game of Diplomacy. This work has been featured in The New York Times, The Washington Post, MIT Technology Review, Forbes, and other major media outlets.
Talk Details: https://www.isi.edu/events/4076/nl-seminar-interactive-ai-systems-specialized-in-social-influence/
October 5, 2023
On formulating and evaluating language agents
Abstract: Language agents are AI systems that use large language models (LLMs) to interact with the world. While various methods have been developed, it is often hard to systematically understand or evaluate them. In this talk, we present Cognitive Architectures for Language Agents (CoALA), a theoretical framework grounded in the classical research on cognitive architectures that makes sense of existing agents and sheds light on future directions. We also present three benchmarks (WebShop, InterCode, Collie) for developing and evaluating language agents using web, code, and grammar tasks, respectively. Notably, all three are scalable and practical, with simple and faithful evaluation metrics that do not rely on human preference labeling or LLM scoring.
Bio: Shunyu Yao is a final-year Ph.D. student advised by Karthik Narasimhan in the Princeton NLP Group. His research focuses on language agents and is supported by the Harold W. Dodds Fellowship from Princeton.
Talk Details: https://www.isi.edu/events/4171/on-formulating-and-evaluating-language-agents/
August 31, 2023
Phishing Emails, Improvised Explosive Devices and Quantum: A Natural Language Understanding Perspective
April 20, 2023
Modular Language Models
April 13, 2023
Drinking From The Firehose of Science
March 30, 2023
Getting AI to Do Things I Can't
March 23, 2023
Designing and Evaluating Language Models for Human Interaction
March 9, 2023
Enhancing Machine Translation with Large Language Models via Optimizing In-Context Examples and Dictionary-Based Prompting
February 2, 2023
Scaling unlocks emergent abilities in language models
January 12, 2023
Bias and Power in NLP