Seminars and Events
Collapse of Dense Retrievers & Steering MoE LLMs
Event Details
Location: CR#1135 Conference Room ISI-MDR
Speaker: Mohsen Fayyaz, UCLA
Abstract: This presentation explores two critical vulnerabilities in modern retrieval and language models. First, we highlight how dense retrievers collapse under heuristic biases, favoring short, early, and literal matches over factual evidence, which leads to severe downstream failures in Retrieval-Augmented Generation. Second, we introduce SteerMoE, a framework for controlling Mixture-of-Experts LLMs by selectively activating or deactivating behavior-linked experts, which enables improved safety and faithfulness but also exposes risks of unsafe steering. Together, these works reveal both weaknesses and opportunities in building more robust, controllable AI systems.
Join Zoom Meeting
https://usc.zoom.us/j/92751744356?pwd=KgeSGQnYX7sCYBjIBXGzvqw3vW5pAs.1
Meeting ID: 927 5174 4356
Passcode: 2025
Speaker Bio
Mohsen Fayyaz is a third-year Ph.D. student in Computer Science at UCLA, advised by Prof. Nanyun (Violet) Peng. His research focuses on Natural Language Processing, with interests in interpretability, explainability, and robustness. He has published work on token attribution, metaphors in pre-trained models, probing, dataset pruning, rationale explanations, dense retriever robustness, and steering Mixture-of-Experts LLMs.
For more information on the NL Seminar series and upcoming talks, please visit:
https://www.isi.edu/research-groups-nlg/nlg-seminars/
Hosts: Jonathan May and Dhananjay Ashok