Date
Speaker
Title
October 24, 2024
Julie Kallini
Mission: Impossible Language Models
Abstract: Chomsky and others have very directly claimed that large language models (LLMs) are equally capable of learning languages that are possible and impossible for humans to learn. However, there is very little published experimental evidence to support such a claim. Here, we develop a set of synthetic impossible languages of differing complexity, each designed by systematically altering English data with unnatural word orders and grammar rules. These languages lie on an impossibility continuum: at one end are languages that are inherently impossible, such as random and irreversible shuffles of English words, and on the other, languages that may not be intuitively impossible but are often considered so in linguistics, particularly those with rules based on counting word positions. We report on a wide range of evaluations to assess the capacity of GPT-2 small models to learn these uncontroversially impossible languages, and crucially, we perform these assessments at various stages throughout training to compare the learning process for each language. Our core finding is that GPT-2 struggles to learn impossible languages when compared to English as a control, challenging the core claim. More importantly, we hope our approach opens up a productive line of inquiry in which different LLM architectures are tested on a variety of impossible languages in an effort to learn more about how LLMs can be used as tools for these cognitive and typological investigations.
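To make the kind of manipulation concrete, here is a minimal Python sketch (not the authors' released code) of two illustrative perturbations in the spirit of the abstract: an irreversible random shuffle of a sentence's words, and a toy rule keyed to counting word positions. The function names and the specific rules are assumptions made purely for illustration.

```python
import random

def shuffle_words(tokens, seed=0):
    """Randomly permute the words of a sentence. Without recording the
    permutation, the original order cannot be recovered, so the mapping
    from English to this 'language' is irreversible."""
    rng = random.Random(seed)
    shuffled = tokens[:]
    rng.shuffle(shuffled)
    return shuffled

def reverse_after_position(tokens, k=3):
    """A toy counting-based rule: keep the first k words, reverse the rest.
    Rules keyed to absolute word positions are the kind of grammar the
    abstract describes as counting-based."""
    return tokens[:k] + tokens[k:][::-1]

sentence = "the cat sat on the mat".split()
print(shuffle_words(sentence))           # a random permutation of the words
print(reverse_after_position(sentence))  # ['the', 'cat', 'sat', 'mat', 'the', 'on']
```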
Bio: Julie Kallini is a second-year Computer Science Ph.D. student at Stanford University advised by Christopher Potts and Dan Jurafsky. Her research spans several topics in natural language processing, including computational linguistics, cognitive science, interpretability, and model architecture. Julie's work is generously supported by the NSF Graduate Research Fellowship, the Stanford School of Engineering Graduate Fellowship, and the Stanford EDGE Fellowship.
Talk Details: here
Recording: here
September 26, 2024
Lee Kezar
Modeling American Sign Language via Linguistic Knowledge Infusion
Abstract: As language technologies rapidly gain popularity and utility, many of the 70 million deaf and hard-of-hearing people who prefer a sign language are left behind. While NLP research into American Sign Language (ASL) is gaining popularity, we continue to face serious challenges like data scarcity and low engagement with ASL users and experts. This presentation will cover how ASL models strongly benefit from neuro-symbolically learning the linguistic structure of signs, yielding gains with respect to their data efficiency, explainability, and generalizability. Concretely, we show that phonological, morphological, and semantic knowledge "infusion" can increase sign recognition accuracy by 30%, enable few- and zero-shot sign understanding, reduce sensitivity to signer demographics, and address longstanding research questions in sign language phonology and language acquisition.
Bio: Lee Kezar (he/they) is a fifth-year Ph.D. candidate in the USC Viterbi School of Engineering, advised by Jesse Thomason in the Grounding Language in Actions, Multimodal Observations, and Robotics (GLAMOR) Lab. Their research blends computational, linguistic, and psychological models of ASL to increase access to language technologies and advance theoretical perspectives on signing and co-speech gesture.
Talk Details: here
Recording (includes closed captioning and ASL interpreter): here
May 9, 2024
Tanmay Parekh
Event Extraction for Epidemic Prediction
Abstract: Early warnings and effective control measures are among the most important tools for policymakers to be prepared against the threat of any epidemic. Social media is an important information source here, as it is more timely than other alternatives like news and public health reports, and it is publicly accessible. Given the sheer volume of daily social media posts, there is a need for an automated system to monitor social media to provide early and effective epidemic prediction. To this end, I introduce two works to aid the creation of such an automated system using information extraction. In my first work, we pioneer exploiting Event Detection (ED) for better preparedness and early warnings of any upcoming epidemic by developing a framework to extract and analyze epidemic-related events from social media posts. We curate an epidemic event ontology comprising seven disease-agnostic event types and construct SPEED, a Twitter dataset focused on the COVID-19 pandemic. Experimentation reveals how ED models trained on COVID-based SPEED can effectively detect epidemic events for three unseen epidemics of Monkeypox, Zika, and Dengue. Furthermore, we show that reporting sharp increases in the extracted events by our framework can provide warnings 4-9 weeks earlier than the WHO epidemic declaration for Monkeypox.
Since epidemics can originate across the globe, social media posts discussing them can be in varied languages. However, training supervised models on every language is a tedious and resource-expensive task. The alternative is to use zero-shot cross-lingual models. In my second work, we introduce a new approach for label projection that can be used to generate synthetic training data in any language using the translate-train paradigm. This novel approach, CLaP, translates text to the target language and performs contextual translation on the labels using the translated text as the context, ensuring better accuracy for the translated labels. We leverage instruction-tuned language models with multilingual capabilities as our contextual translator, imposing the constraint of the presence of translated labels in the translated text via instructions. We benchmark CLaP against other label projection techniques on zero-shot cross-lingual transfer across 39 languages on two representative structured prediction tasks, event argument extraction (EAE) and named entity recognition (NER), showing over 2.4 F1 improvement for EAE and 1.4 F1 improvement for NER.
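As a rough illustration of the translate-train idea described above, the sketch below builds the two prompts of a CLaP-style pipeline: first translate the sentence, then translate each label span in the context of the translated sentence, instructing the model to answer with a span copied from it. The `llm` callable is a stand-in for any instruction-tuned model, and the prompt wording is invented for illustration; neither is taken from the CLaP release.

```python
def clap_style_label_projection(text, labels, target_lang, llm):
    """Sketch of contextual label projection in the translate-train setting.

    `llm` maps a prompt string to a completion string; it is an assumption
    of this sketch, not a specific API.
    """
    # Step 1: translate the full sentence into the target language.
    translated_text = llm(
        f"Translate the following sentence into {target_lang}:\n{text}"
    )
    # Step 2: translate each label span *in context*, instructing the model
    # to return a span that actually appears in the translated sentence.
    projected = []
    for label in labels:
        span = llm(
            f"Sentence: {translated_text}\n"
            f"Give the substring of the sentence above that translates "
            f"'{label}'. Answer with a span copied verbatim from the sentence."
        )
        projected.append(span.strip())
    return translated_text, projected
```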
Bio: Tanmay Parekh is a third-year PhD student in Computer Science at the University of California, Los Angeles (UCLA). He is advised by Prof. Nanyun Peng and Prof. Kai-Wei Chang. Previously, he completed his Master's at the Language Technologies Institute at Carnegie Mellon University (CMU), where he worked with Prof. Alan Black and Prof. Graham Neubig. He completed his undergraduate studies at the Indian Institute of Technology Bombay (IITB). He has also worked in industry at Amazon and Microsoft. His research has spanned multilingual NLP, code-switching, controlled generation, and speech technologies. His current research focuses on improving the utilization and generalizability of Large Language Models (LLMs) for applications in Information Extraction (specifically Event Extraction) across various languages and domains.
April 25, 2024
Matthew Finlayson
How to Steal ChatGPT’s Embedding Size, and Other Low-rank Logit Tricks
Abstract: The commercialization of large language models (LLMs) has led to the common practice of restricting access to proprietary models via a limited API. In this work we show that, with only a conservative assumption about the model architecture, it is possible to learn a surprisingly large amount of non-public information about an API-protected LLM from a relatively small number of API queries (e.g., costing under $1000 USD for OpenAI’s gpt-3.5-turbo). Our findings are centered on one key observation: most modern LLMs suffer from a softmax bottleneck, which restricts the model outputs to a linear subspace of the full output space. We exploit this fact to unlock several capabilities, including (but not limited to) obtaining cheap full-vocabulary outputs, auditing for specific types of model updates, identifying the source LLM given a single full LLM output, and even efficiently discovering the LLM’s hidden size. Our empirical investigations show the effectiveness of our methods, which allow us to estimate the embedding size of OpenAI’s gpt-3.5-turbo to be about 4096. Lastly, we discuss ways that LLM providers can guard against these attacks, as well as how these capabilities can be viewed as a feature (rather than a bug) by allowing for greater transparency and accountability.
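The key observation, that a softmax bottleneck confines every logit vector to a d-dimensional linear subspace of the vocabulary-sized output space, can be illustrated with a small numpy simulation. This is a toy model, not the paper's attack on a real API: collect more logit vectors than the hidden size, stack them, and read the hidden size off the numerical rank.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an LM with hidden size d and vocabulary size V: every logit
# vector is W @ h for some hidden state h, so all outputs lie in a
# d-dimensional subspace of R^V (the softmax bottleneck).
d, V, n_queries = 64, 1000, 200
W = rng.normal(size=(V, d))          # output embedding matrix
H = rng.normal(size=(d, n_queries))  # hidden states from n_queries prompts
logits = W @ H                       # (V, n_queries) observed output vectors

# Stack the collected logit vectors and inspect their singular values:
# the numerical rank of this matrix reveals the hidden size.
singular_values = np.linalg.svd(logits, compute_uv=False)
estimated_hidden_size = int(np.sum(singular_values > 1e-8 * singular_values[0]))
print(estimated_hidden_size)         # prints 64
```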
Bio: Matthew Finlayson is a PhD student studying NLP at the University of Southern California. Previously he was a predoctoral researcher at the Allen Institute for AI (AI2) after completing his bachelor's degree in computer science and linguistics at Harvard University. Matthew is interested in the practical consequences of the architectural design of language models, from security to generation, as well as understanding how language models learn and generalize from data.
Talk Details: https://www.isi.edu/events/4885/nl-seminar-how-to-steal-chatgpts-embedding-size-and-other-low-rank-logit-tricks/
April 18, 2024
Oliver Liu
DeLLMa: A Framework for Decision Making Under Uncertainty with Large Language Models
Abstract: Large language models (LLMs) are increasingly used across society, including in domains like business, engineering, and medicine. These fields often grapple with decision-making under uncertainty, a critical yet challenging task. In this paper, we show that directly prompting LLMs on these types of decision-making problems yields poor results, especially as the problem complexity increases. To overcome this limitation, we propose DeLLMa (Decision-making Large Language Model assistant), a framework designed to enhance decision-making accuracy in uncertain environments. DeLLMa involves a multi-step scaffolding procedure, drawing upon principles from decision theory and utility theory, to provide an optimal and human-auditable decision-making process. We validate our framework on decision-making environments involving real agriculture and finance data. Our results show that DeLLMa can significantly improve LLM decision-making performance, achieving up to a 40% increase in accuracy over competing methods.
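For readers unfamiliar with the decision-theoretic machinery the framework scaffolds, the toy example below shows the final expected-utility step. The states, probabilities, and utilities here are made up for illustration; in DeLLMa they would be elicited from the LLM over the course of the multi-step procedure.

```python
# A toy expected-utility calculation of the kind a decision-theoretic
# scaffold ends with. All numbers below are invented for illustration.
state_probs = {"drought": 0.2, "normal": 0.6, "wet": 0.2}
utilities = {
    ("plant corn",  "drought"): 1, ("plant corn",  "normal"): 8, ("plant corn",  "wet"): 6,
    ("plant wheat", "drought"): 4, ("plant wheat", "normal"): 6, ("plant wheat", "wet"): 5,
}
actions = {a for a, _ in utilities}

def expected_utility(action):
    """Sum utility over latent states, weighted by their probabilities."""
    return sum(p * utilities[(action, s)] for s, p in state_probs.items())

best = max(actions, key=expected_utility)
print(best, {a: expected_utility(a) for a in sorted(actions)})
# plant corn {'plant corn': 6.2, 'plant wheat': 5.4}
```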
Bio: Ollie Liu (https://ollieliu.com/) is a second-year Ph.D. student in Computer Science at the University of Southern California, co-advised by Prof. Dani Yogatama and Prof. Willie Neiswanger. In life, he usually goes by Oliver 🫒. His current research interests lie in (multimodal) foundation models, especially their algorithmic reasoning capabilities and applications in the sciences.
Talk Details: https://www.isi.edu/events/4875/nl-seminar-dellma-a-framework-for-decision-making-under-uncertainty-with-large-language-models/
April 4, 2024
Kevin Knight
30 Years of Perplexity
Abstract: NLP scientists have been trying for decades to accurately predict the next word in running text. Why were we so determined to succeed at this strange task? How did we track our successes (and failures)? Why was word prediction at the center of early statistical work in text compression, machine translation, and speech recognition? Will it lead to artificial general intelligence (AGI) in the 2020s? I'll attempt to answer these questions with anecdotes drawn from three decades of research in NLP, text compression, and code-breaking.
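For readers new to the metric in the title, here is a minimal sketch of how perplexity is computed from a model's per-token probabilities (the standard definition, not anything specific to the talk).

```python
import math

def perplexity(token_probs):
    """Perplexity of a text under a model, given the model's probability for
    each successive token: exp of the average negative log-probability."""
    n = len(token_probs)
    avg_neg_logprob = -sum(math.log(p) for p in token_probs) / n
    return math.exp(avg_neg_logprob)

# A model that assigns probability 0.25 to every word of a 4-word text has
# perplexity 4: on average it is as uncertain as a uniform choice among 4 words.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # 4.0
```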
Bio: Dr. Kevin Knight served on the faculty of the University of Southern California (26 years), as Chief Scientist at Language Weaver, Inc. (9 years), and as Chief Scientist for Natural Language Processing at Didi Global (4 years). He received a PhD in computer science from Carnegie Mellon University and a bachelor's degree from Harvard University. Dr. Knight's research interests include machine translation, natural language generation, automata theory, decipherment of historical documents, and number theory. He has co-authored over 150 research papers on natural language processing, as well as the widely adopted textbook "Artificial Intelligence" (McGraw-Hill). Dr. Knight served as President of the Association for Computational Linguistics (ACL) in 2011, as General Chair for ACL in 2005, as General Chair for the North American ACL (NAACL) in 2016, and as co-program chair for the inaugural Asia-Pacific ACL (2020). He received an Outstanding Paper Award at NAACL 2018, and Test-of-Time awards at ACL 2022 and ACL 2023. He is a Fellow of the ACL, the Association for the Advancement of Artificial Intelligence (AAAI), and the USC Information Sciences Institute (ISI).
Talk Details: https://www.isi.edu/events/4667/30-years-of-perplexity/
March 28, 2024
Shivanshu Gupta
Informative Example Selection for In-Context Learning
Abstract: In-context Learning (ICL) uses large language models (LLMs) for new tasks by conditioning them on prompts comprising a few task examples. With the rise of LLMs that are intractable to train or hidden behind APIs, the importance of such a training-free interface cannot be overstated. However, ICL is known to be critically sensitive to the choice of in-context examples. Despite this, the standard approach for selecting in-context examples remains to use general-purpose retrievers due to the limited effectiveness and training requirements of prior approaches. In this talk, I’ll posit that good in-context examples demonstrate the salient information necessary to solve a given test input. I’ll present efficient approaches for selecting such examples, with a special focus on preserving the training-free ICL pipeline. Through results with a wide range of tasks and LLMs, I’ll demonstrate that selecting informative examples can indeed yield superior ICL performance.
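One way to picture "demonstrating the salient information" is a greedy, training-free selector that prefers candidate examples covering aspects of the test input not yet covered by examples already chosen. The sketch below is only an illustration of that intuition, not the speaker's algorithm, and the example pool is invented.

```python
def select_examples(test_input, pool, k=2):
    """Greedy, training-free selection sketch: repeatedly pick the candidate
    whose input covers the most test-input words not yet covered by the
    examples chosen so far."""
    test_words = set(test_input.lower().split())
    chosen, covered = [], set()
    for _ in range(min(k, len(pool))):
        def gain(example):
            words = set(example["input"].lower().split())
            return len((words & test_words) - covered)
        best = max((e for e in pool if e not in chosen), key=gain)
        chosen.append(best)
        covered |= set(best["input"].lower().split()) & test_words
    return chosen

pool = [
    {"input": "convert 3 miles to kilometers", "output": "4.83 km"},
    {"input": "what is the capital of France", "output": "Paris"},
    {"input": "convert 70 fahrenheit to celsius", "output": "21.1 C"},
]
# Picks the two unit-conversion examples, which jointly cover the test input.
print(select_examples("convert 5 miles to celsius", pool))
```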
Bio: Shivanshu Gupta is a Computer Science Ph.D. Candidate at the University of California Irvine, advised by Sameer Singh. Prior to this, he was a Research Fellow at LinkedIn and Microsoft Research India, and completed his B.Tech. and M.Tech. in Computer Science at IIT Delhi. His primary research interests are systematic generalization, in-context learning, and multi-step reasoning capabilities of large language models.
Talk Details: https://www.isi.edu/events/4638/informative-example-selection-for-in-context-learning/
March 21, 2024
Anthony Chen
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI
Abstract: The arms race to train language models on vast, diverse, and inconsistently documented datasets has raised pressing concerns about the legal and ethical risks for practitioners. To remedy practices that threaten data transparency and understanding, we introduce the Data Provenance Initiative, a multi-disciplinary effort between legal and machine learning experts to systematically audit and trace 1800+ text datasets. We develop tools and standards to trace the lineage of these datasets, covering their source, creators, series of license conditions, properties, and subsequent use. Our landscape analysis highlights the sharp divides in composition and focus of commercially open vs. closed datasets, with closed datasets monopolizing important categories: lower-resource languages, more creative tasks, richer topic variety, newer and more synthetic training data.
Bio: Anthony Chen is an engineer at Google DeepMind doing research on factuality and long-context language models. He received his PhD from UC Irvine last year where he focused on generative evaluation and factuality in language models.
Talk Details: https://www.isi.edu/events/4398/nl-seminar-the-data-provenance-initiative-a-large-scale-audit-of-dataset-licensing-attribution-in-ai/
March 18, 2024
Sky C.H. Wang
Do Androids Know They're Only Dreaming of Electric Sheep?
Abstract: We design probes trained on the internal representations of a transformer language model that are predictive of its hallucinatory behavior on in-context generation tasks. To facilitate this detection, we create a span-annotated dataset of organic and synthetic hallucinations over several tasks. We find that probes trained on the force-decoded states of synthetic hallucinations are generally ecologically invalid in organic hallucination detection. Furthermore, hidden state information about hallucination appears to be task and distribution-dependent. Intrinsic and extrinsic hallucination saliency varies across layers, hidden state types, and tasks; notably, extrinsic hallucinations tend to be more salient in a transformer's internal representations. Outperforming multiple contemporary baselines, we show that probing is a feasible and efficient alternative to language model hallucination evaluation when model states are available.
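Mechanically, such a probe is a lightweight classifier trained on hidden states with span-level hallucination labels. The sketch below uses randomly generated stand-in data in place of real transformer states, so only the probing recipe, not the data or model, reflects the talk.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in data: in practice each row would be a transformer hidden state for
# one generated token, and the label would mark whether that token falls
# inside an annotated hallucinated span.
hidden_size, n_tokens = 32, 400
direction = rng.normal(size=hidden_size)           # pretend "hallucination" direction
states = rng.normal(size=(n_tokens, hidden_size))
labels = (states @ direction + rng.normal(scale=0.5, size=n_tokens) > 0).astype(int)

# The probe itself is just a linear classifier over hidden states.
probe = LogisticRegression(max_iter=1000).fit(states[:300], labels[:300])
print("held-out accuracy:", probe.score(states[300:], labels[300:]))
```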
Bio: Sky is a Ph.D. candidate in Computer Science at Columbia University, advised by Zhou Yu and Smaranda Muresan. His research centers on Natural Language Processing (NLP), with broad interests in the area where NLP meets Computational Social Science (CSS). Within this space, he focuses on three major areas: (1) revealing and designing for social difference and inequality, (2) cross-cultural NLP, and (3) mechanistic interpretability. His research is supported by an NSF Graduate Research Fellowship and has received two outstanding paper awards at EMNLP. He has previously been an intern at Microsoft Semantic Machines, Google Research, and Amazon AWS AI.
Talk Details: https://www.isi.edu/events/4396/nl-seminar-do-androids-know-theyre-only-dreaming-of-electric-sheep/
March 7, 2024
Zixiang Chen
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Abstract: Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs). In this talk, I will introduce our newest fine-tuning method, Self-Play Fine-Tuning (SPIN), which improves LLMs without the need for additional human-annotated data. SPIN utilizes a self-play mechanism, where the LLM enhances its capabilities by generating its own training data through interactions with instances of itself. Specifically, the LLM generates its own training data from its previous iterations, refining its policy by discerning these self-generated responses from those obtained from human-annotated data. As a result, SPIN unlocks the full potential of human-annotated data for SFT. Our empirical results show that SPIN can improve the LLM's performance across a variety of benchmarks and even outperform models trained through direct preference optimization (DPO) supplemented with extra GPT-4 preference data. Additionally, I will outline the theoretical guarantees of our method. For more details and access to our code, visit our GitHub repository (https://github.com/uclaml/SPIN).
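A single training example of the self-play idea can be sketched as a DPO-style logistic loss that rewards the current model for separating the human-annotated response from a response generated by its own previous iteration. The exact parameterization in SPIN may differ, and the log-probabilities below are made-up numbers; see the repository above for the actual implementation.

```python
import math

def self_play_loss(lp_human_cur, lp_human_prev, lp_self_cur, lp_self_prev, beta=0.1):
    """One-example sketch of a self-play discrimination loss.

    The current model is pushed to raise its relative probability of the
    human-annotated response and lower its relative probability of the
    response generated by its previous iteration (logistic/DPO-style form,
    used here for illustration only).
    """
    margin = (lp_human_cur - lp_human_prev) - (lp_self_cur - lp_self_prev)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Log-probabilities (invented numbers) of the human response and of a
# self-generated response under the previous and current model.
print(self_play_loss(lp_human_cur=-40.0, lp_human_prev=-45.0,
                     lp_self_cur=-38.0,  lp_self_prev=-35.0))
```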
Bio: Zixiang Chen is a Ph.D. student in the Department of Computer Science at the University of California, Los Angeles (UCLA), advised by Prof. Quanquan Gu. He obtained his bachelor’s degree in mathematics from Tsinghua University. He is broadly interested in the theory and applications of deep learning, optimization, and control, with a focus on generative models, representation learning, and multi-agent reinforcement learning. Recently, he has been using AI to enhance scientific discovery in the domain of public health. He was a visiting graduate student in the Theory of Reinforcement Learning program at the Simons Institute for the Theory of Computing.
Talk Details: https://www.isi.edu/events/4400/nl-seminar-self-play-fine-tuning-converts-weak-language-models-to-strong-language-models/
February 22, 2024
Yihan Wang
Red Teaming Language Model Detectors with Language Models
Abstract: The prevalence and strong capability of large language models (LLMs) present significant safety and ethical risks if exploited by malicious users. To prevent the potentially deceptive usage of LLMs, recent works have proposed algorithms to detect LLM-generated text and protect LLMs. In this paper, we investigate the robustness and reliability of these LLM detectors under adversarial attacks. We study two types of attack strategies: 1) replacing certain words in an LLM's output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation. In both strategies, we leverage an auxiliary LLM to generate the word replacements or the instructional prompt. Different from previous works, we consider a challenging setting where the auxiliary LLM can also be protected by a detector. Experiments reveal that our attacks effectively compromise the performance of all detectors in the study with plausible generations, underscoring the urgent need to improve the robustness of LLM-generated text detection systems. This talk may also introduce some of our other recent works on trustworthy and ethical LLMs.
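The first attack strategy can be sketched as follows, with `aux_llm` (the auxiliary LLM) and `detector` left as stand-in callables rather than any specific API; the prompt wording is invented for illustration.

```python
import random

def paraphrase_attack(text, aux_llm, detector, n_swaps=3, seed=0):
    """Sketch of the word-substitution attack: ask an auxiliary LLM for
    contextual synonyms of a few randomly chosen words, then compare the
    detector's score before and after the edit.

    aux_llm(prompt) -> str and detector(text) -> float (probability that
    the text is machine-generated) are stand-in callables.
    """
    rng = random.Random(seed)
    words = text.split()
    for i in rng.sample(range(len(words)), k=min(n_swaps, len(words))):
        context = " ".join(words)
        prompt = (f"In the sentence '{context}', suggest a synonym for the "
                  f"word '{words[i]}' that fits the context. "
                  f"Answer with one word.")
        words[i] = aux_llm(prompt).strip()
    attacked = " ".join(words)
    # The attack succeeds if the detector's score drops on the edited text.
    return attacked, detector(text), detector(attacked)
```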
Bio: Yihan is a PhD candidate in the Department of Computer Science at UCLA, advised by Prof. Cho-Jui Hsieh. Her research interest lies in trustworthy and generalizable machine learning. She is one of the recipients of the 2023 UCLA-Amazon Fellowship. More details can be found at https://yihanwang617.github.io.
Talk Details: https://www.isi.edu/events/4392/nl-seminar-red-teaming-language-model-detectors-with-language-models/
February 1, 2024
Yufei Tian
Harnessing Black-Box Control to Boost Commonsense in LM's Generation
Abstract: Large language models like Alpaca and GPT-3 generate coherent texts but sometimes lack commonsense, yet improving their commonsense via fine-tuning is resource-expensive in terms of both data and computation. In this talk, I’ll present BOOST, a resource-efficient framework that steers a frozen Pre-Trained Language Model (PTLM) towards more reasonable outputs. This involves creating an interpretable and reference-free evaluator that assigns each sentence a commonsense score by grounding it to a dynamic commonsense knowledge base. Using this evaluator as a guide, we extend the NADO controllable generation method to train an auxiliary head that improves the PTLM’s output. Our framework was tested on various language models, including GPT-2, Flan-T5, and Alpaca-based models. On two constrained concept-to-sentence benchmarks, human evaluation results show that BOOST consistently generates the most commonsensical content. Finally, I will demonstrate how ChatGPT outputs differ from, and are sometimes less favored than, our outputs.
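BOOST itself trains an auxiliary head for controllable generation, but the role of the reference-free evaluator can be illustrated in a much simpler form: score candidate sentences by how well their word pairs ground to a commonsense knowledge base and prefer the higher-scoring one. The toy scorer, knowledge base, and candidates below are invented for illustration and are not the evaluator from the talk.

```python
def commonsense_score(sentence, knowledge_base):
    """Toy reference-free evaluator: fraction of adjacent word pairs in the
    sentence that are found in a commonsense knowledge base. A stand-in for
    the grounded evaluator described in the abstract."""
    words = sentence.lower().rstrip(".").split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(pair in knowledge_base for pair in pairs) / len(pairs)

knowledge_base = {("dog", "barks"), ("sun", "rises"), ("ice", "melts")}
candidates = ["The ice melts in the sun.", "The ice barks at the moon."]
# Guidance reduced to its simplest form: prefer the candidate the evaluator
# scores as more commonsensical.
print(max(candidates, key=lambda s: commonsense_score(s, knowledge_base)))
```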
Bio: Yufei Tian is a CS PhD student at UCLA advised by Prof. Nanyun (Violet) Peng. Her research is centered around creative and controllable text generation, machine reasoning and its interaction with cognitive science, as well as designing evaluation metrics for open-ended NLG tasks. She is supported by the UCLA-Amazon fellowship program.
Talk Details: https://www.isi.edu/events/4386/nl-seminar-harnessing-black-box-control-to-boost-commonsense-in-lms-generation/
January 25, 2024
Ulf Hermjakob
𓂋𓏤𓈖𓆎𓅓𓏏𓊖 An Introduction to Egyptian from Hieroglyphs to Coptic
Abstract: The Egyptian language, with its written history of more than 5,000 years, continues to fascinate people, especially with its hieroglyphs and its complex system of logograms, phonograms and determinatives. This introduction will focus on hieroglyphs, but also cover later scripts (Hieratic, Demotic, Coptic), the decipherment of hieroglyphs 200 years ago, samples from Ancient Egyptian texts, and a few linguistically interesting tidbits.
The Getty Villa has two Egyptian exhibitions on offer, The Egyptian Book of the Dead (until January 29) and Sculpted Portraits from Ancient Egypt (opening January 24). The ISI CuteLabName (NLP) group will visit the Getty Villa on Saturday afternoon (Jan. 27).
Bio: Ulf is a senior research scientist and computational linguist in the Natural Language Group at ISI, working on a wide range of languages. He has a Ph.D. in computer science from the University of Texas at Austin.
Talk Details: https://www.isi.edu/events/4394/%F0%93%82%8B%F0%93%8F%A4%F0%93%88%96%F0%93%86%8E%F0%93%85%93%F0%93%8F%8F%F0%93%8A%96-an-introduction-to-egyptian-from-hieroglyphs-to-coptic/