Name: NL Seminar-Do Androids Know They're Only Dreaming of Electric Sheep?
Start: 2024-03-18T11:00:00-07:00
End: 2024-03-18T12:00:00-07:00

ISI Natural Language Seminar

NL Seminar-Do Androids Know They're Only Dreaming of Electric Sheep?

When

Monday, March 18, 2024 11:00am - 12:00pm PDT

Presenter

Presented by:

Sky Wang, Columbia University

Location

Conference Rm #689. In-person attendance is permitted for USC/ISI faculty, staff, and students only. The talk is open to the public virtually via the Zoom link.

This event is open to:

Everyone

Event Details

Speaker: Sky Wang, Columbia University

Conference Rm Location: ISI-MDR #689. In-person attendance is permitted for USC/ISI faculty, staff, and students only. The talk is open to the public virtually via Zoom.

This talk will be recorded and shared publicly on the ISI YouTube channel at a future date in 2024.

REMINDER:

If you do not have access to the 6th floor, please check in at the main reception desk on the 10th floor, and someone will escort you to the conference room before the talk starts.

Meeting hosts admit only guests they know to the Zoom meeting, so you are strongly encouraged to sign into Zoom with your USC account.

If you are an outside visitor, please send your full name, title, and workplace to nlg-seminar-host(at)isi.edu beforehand so that we are aware of your attendance. Also let us know whether you plan to attend in person or virtually.

For more information on the NL Seminar series and upcoming talks, please visit:

https://www.isi.edu/research-groups-nlg/nlg-seminars/

Hosts: Jon May and Justin Cho

Abstract:

We design probes trained on the internal representations of a transformer language model that are predictive of its hallucinatory behavior on in-context generation tasks. To facilitate this detection, we create a span-annotated dataset of organic and synthetic hallucinations over several tasks. We find that probes trained on the force-decoded states of synthetic hallucinations are generally ecologically invalid in organic hallucination detection. Furthermore, hidden state information about hallucination appears to be task- and distribution-dependent. Intrinsic and extrinsic hallucination saliency varies across layers, hidden state types, and tasks; notably, extrinsic hallucinations tend to be more salient in a transformer’s internal representations. Outperforming multiple contemporary baselines, we show that probing is a feasible and efficient alternative to language model hallucination evaluation when model states are available.
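
For readers unfamiliar with probing, the general idea can be sketched as follows. This is only a minimal illustration under stated assumptions, not the speaker's method: the abstract does not specify the probe architecture, so a plain logistic-regression probe is assumed here, and the "hidden states" are synthetic Gaussian vectors produced by a hypothetical helper (`make_toy_states`) rather than states taken from a real language model. Label 1 stands for a hallucinated token and 0 for a grounded one.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    # Probability that the hidden state x belongs to a hallucinated token.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_probe(states, labels, lr=0.05, epochs=50):
    # Fit a logistic-regression probe with plain stochastic gradient descent.
    w = [0.0] * len(states[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(states, labels):
            g = predict(w, b, x) - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def make_toy_states(n, dim, shift, rng):
    # Hypothetical stand-in for per-token hidden states: Gaussian noise,
    # with "hallucinated" examples shifted along the first dimension.
    states, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.5 else 0
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        if y == 1:
            x[0] += shift
        states.append(x)
        labels.append(y)
    return states, labels


rng = random.Random(0)
train_x, train_y = make_toy_states(200, 8, 4.0, rng)
test_x, test_y = make_toy_states(200, 8, 4.0, rng)

w, b = train_probe(train_x, train_y)
hits = sum((predict(w, b, x) >= 0.5) == (y == 1) for x, y in zip(test_x, test_y))
accuracy = hits / len(test_y)
print(f"held-out probe accuracy: {accuracy:.2f}")
```

In a real setting, the toy vectors would be replaced by hidden states extracted from a chosen layer of the model, paired with span-level hallucination annotations. Because the abstract reports that this signal is task- and distribution-dependent, a probe would need to be evaluated on the same kind of data it is meant to be used on.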

Speaker Bio

Sky is a Ph.D. candidate in Computer Science at Columbia University, advised by Zhou Yu and Smaranda Muresan. His research centers on Natural Language Processing (NLP), with broad interests in the area where NLP meets Computational Social Science (CSS). Within that area, his work focuses on three major topics: (1) revealing and designing for social difference and inequality, (2) cross-cultural NLP, and (3) mechanistic interpretability. His research is supported by an NSF Graduate Research Fellowship and has received two outstanding paper awards at EMNLP. He has previously been an intern at Microsoft Semantic Machines, Google Research, and Amazon AWS AI.

If the speaker approves recording of this NL Seminar talk, the recording will be posted on our USC/ISI YouTube page within 1-2 business days: https://www.youtube.com/user/USCISI.

Subscribe here to learn more about upcoming seminars: https://www.isi.edu/events/

This program is open to all eligible individuals. Information Sciences Institute operates all of its programs and activities consistent with the University’s Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.