Collapse of Dense Retrievers & Steering MoE LLMs

ISI Natural Language Seminar

When

Tuesday, October 21, 2025 1:00pm - 2:00pm PDT

Presenter

Mohsen Fayyaz, UCLA

Location

Conference Rm #1135. In-person attendance is permitted for USC/ISI faculty, staff, and students only. Open to the public virtually via Zoom.

This event is open to:

Everyone

Event Details

Location: CR#1135 Conference Room ISI-MDR

Speaker: Mohsen Fayyaz, UCLA

Abstract: This presentation explores two critical vulnerabilities in modern retrieval and language models. First, we highlight how dense retrievers collapse under heuristic biases, favoring short, early, and literal matches over factual evidence, which leads to severe downstream failures in Retrieval-Augmented Generation. Second, we introduce SteerMoE, a framework for controlling Mixture-of-Experts LLMs by selectively activating or deactivating behavior-linked experts, enabling improved safety and faithfulness but also exposing risks of unsafe steering. Together, these works reveal both weaknesses and opportunities in building more robust, controllable AI systems.

Join Zoom Meeting
https://usc.zoom.us/j/92751744356?pwd=KgeSGQnYX7sCYBjIBXGzvqw3vW5pAs.1

Meeting ID: 927 5174 4356
Passcode: 2025

Speaker Bio

Mohsen Fayyaz is a third-year Ph.D. student in Computer Science at UCLA, advised by Prof. Nanyun (Violet) Peng. His research focuses on Natural Language Processing, with interests in interpretability, explainability, and robustness. He has published work on token attribution, metaphors in pre-trained models, probing, dataset pruning, rationale explanations, dense retriever robustness, and steering Mixture-of-Experts LLMs.

For more information on the NL Seminar series and upcoming talks, please visit:

https://www.isi.edu/research-groups-nlg/nlg-seminars/

Hosts: Jonathan May and Dhananjay Ashok

This program is open to all eligible individuals. Information Sciences Institute operates all of its programs and activities consistent with the University’s Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.