Seminars and Events

ISI Natural Language Seminar

Collapse of Dense Retrievers & Steering MoE LLMs

Event Details

Location: CR#1135 Conference Room ISI-MDR

Speaker: Mohsen Fayyaz, UCLA

Abstract: This presentation explores two critical vulnerabilities in modern retrieval and language models. First, we show how dense retrievers collapse under heuristic biases, favoring short, early, and literal matches over factual evidence, which leads to severe downstream failures in Retrieval-Augmented Generation. Second, we introduce SteerMoE, a framework for controlling Mixture-of-Experts LLMs by selectively activating or deactivating behavior-linked experts, which enables improved safety and faithfulness but also exposes risks of unsafe steering. Together, these works reveal both weaknesses and opportunities in building more robust, controllable AI systems.

Join Zoom Meeting
https://usc.zoom.us/j/92751744356?pwd=KgeSGQnYX7sCYBjIBXGzvqw3vW5pAs.1

Meeting ID: 927 5174 4356
Passcode: 2025

Speaker Bio

Mohsen Fayyaz is a third-year Ph.D. student in Computer Science at UCLA, advised by Prof. Nanyun (Violet) Peng. His research focuses on Natural Language Processing, with interests in interpretability, explainability, and robustness. He has published work on token attribution, metaphors in pre-trained models, probing, dataset pruning, rationale explanations, dense retriever robustness, and steering Mixture-of-Experts LLMs.

For more information on the NL Seminar series and upcoming talks, please visit:

https://www.isi.edu/research-groups-nlg/nlg-seminars/

Hosts: Jonathan May and Dhananjay Ashok