ISI at ACL 2025

by Julia Cohen

students connected like nodes
Image Credit: gorodenkoff/iStock

At the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), held July 27–August 1 in Vienna, Austria, researchers from USC Viterbi’s Information Sciences Institute (ISI) will present ten papers advancing the field of natural language processing. The work spans authorship verification, educational applications, safety and alignment, and the limits of language model understanding across both language and vision.

Highlighted Research

Do Language Models Know How to Teach?

Brihi Joshi, Keyu He, Sahana Ramnath, Sadra Sabouri, Kaitlyn Zhou, Souti Chattopadhyay, Swabha Swayamdipta, Xiang Ren
ACL Findings 2025

Language models are increasingly used in classrooms, but how well do they adjust their answers for different learning levels? In the paper, ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations, researchers introduce a benchmark of 13,000 “why” questions and show that GPT-4 matches the intended grade level only about half the time—far below human-written explanations. The results raise concerns about using LLMs for real educational support without targeted improvements.

A Better Way to Judge Creative Writing

Brihi Joshi, Sriram Venkatapathy, Mohit Bansal, Nanyun Peng, Haw-Shiuan Chang
Oral Talk at GEM2 Workshop: Generation, Evaluation & Metrics, ACL 25

Language models struggle to evaluate stories because they often latch onto surface-level cues rather than substance. This paper, CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization, presents a new method that helps models focus on specific aspects, like plot or character, by generating keywords before scoring. The approach does a better job than GPT-4, and uses much smaller models that more closely match human opinions.

A More Realistic Jailbreak Attack

Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee
ACL Findings 2025

Most jailbreak attacks use single-turn prompts with obvious harmful intent. The paper Red Queen: Exposing Latent Multi-Turn Risks in Large Language Models introduces a more subtle multi-turn strategy that conceals intent over several exchanges—like pretending to stop a bombing but asking for step-by-step instructions to build one. It works much better than older attack methods and shows that even advanced AI systems are vulnerable in more realistic, multi-turn conversations. The authors also propose a defense that keeps models safe without hurting performance.

Who Really Wrote This?

Abraham Israeli, Shuai Liu, Jonathan May,  David Jurgens
ACL Findings 2025

Authorship verification is key for spotting plagiarism, detecting impersonation, and identifying AI-generated content. The Million Authors Corpus: A Cross-Lingual and Cross-Domain Wikipedia Dataset for Authorship Verification introduces a massive new dataset with over 60 million Wikipedia text samples from 1.3 million users across dozens of languages. It’s the first benchmark of its kind to support large-scale, cross-lingual authorship analysis beyond English and narrow domains.

Can AI Keep Up in an Interview?

Alexander Spangher, Michael Lu, Sriya Kalyan, Hyundong Justin Cho, Tenghao Huang, Weiyan Shi, Jonathan May
ACL 2025 Main Conference

Real interviews require asking follow-ups, shifting topics, and planning several moves ahead. In their paper NewsInterview: A Dataset and a Playground to Evaluate LLMs’ Grounding Gap via Informational Interviews, researchers build a dataset of 40,000 interviews from NPR and CNN and shows that LLMs, while fluent, rarely pivot or ask deeper questions. A simulated environment helps probe where these models fall short, and how they might be improved for strategic multi-turn dialogue.

Can AI Understand Mime?

Hyundong Justin Cho, Spencer Lin, Tejas Srinivasan, Michael Saxon, Deuksin Kwon, Natali T. Chavez, Jonathan May
ACL Findings 2025

If someone mimes brushing their teeth or knocking on a door, most people understand instantly. Vision-language models don’t. The paper, Can Vision Language Models Understand Mimed Actions?, introduces a benchmark to test whether AI can interpret mimed actions, and finds that even top models perform far worse than humans. It shows how far we are from true multimodal understanding.

Full List of ISI-Affiliated Papers at ACL 2025

A Little Human Data Goes a Long Way
Dhananjay Ashok, Jonathan May

Can Vision Language Models Understand Mimed Actions?
Hyundong Justin Cho, Spencer Lin, Tejas Srinivasan, Michael Saxon, Deuksin Kwon, Natali T. Chavez, Jonathan May

CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization
Brihi Joshi, Sriram Venkatapathy, Mohit Bansal, Nanyun Peng, Haw-Shiuan Chang

ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations
Brihi Joshi, Keyu He, Sahana Ramnath, Sadra Sabouri, Kaitlyn Zhou, Souti Chattopadhyay, Swabha Swayamdipta, Xiang Ren

Mechanistic Interpretability of Emotion Inference in Large Language Models
Ala N. Tak, Amin Banayeeanzade, Anahita Bolourani, Mina Kian, Robin Jia, Jonathan Gratch

NewsInterview: A Dataset and a Playground to Evaluate LLMs’ Grounding Gap via Informational Interviews
Alexander Spangher, Michael Lu, Sriya Kalyan, Hyundong Justin Cho, Tenghao Huang, Weiyan Shi, Jonathan May

Red Queen: Exposing Latent Multi-Turn Risks in Large Language Models
Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee

R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory
Tenghao Huang, Kinjal Basu, Ibrahim Abdelaziz, Pavan Kapanipathi, Jonathan May, Muhao Chen

Should I Trust You? Detecting Deception in Negotiations Using Counterfactual RL
Wichayaporn Wongkamjan, Yanze Wang, Feng Gu, Denis Peskoff, Jonathan K. Kummerfeld, Jonathan May, Jordan Lee Boyd-Graber

The Million Authors Corpus: A Cross-Lingual and Cross-Domain Wikipedia Dataset for Authorship Verification
Abraham Israeli, Shuai Liu, Jonathan May,  David Jurgens

Note: Every effort was made to include all ISI-affiliated papers at ICWSM25. If your paper was inadvertently left out, please let us know at [email protected] so the list can be updated.

Published on July 28th, 2025

Last updated on July 28th, 2025

Want to write about this story?