Publications
Detecting paralinguistic events in audio stream using context in features and probabilistic decisions
Abstract
Non-verbal communication involves the encoding, transmission and decoding of non-lexical cues and is realized using vocal (e.g. prosody) or visual (e.g. gaze, body language) channels during conversation. These cues serve to maintain conversational flow, express emotions, and mark personality and interpersonal attitude. In particular, non-verbal cues in speech such as paralanguage and non-verbal vocal events (e.g. laughter, sighs, cries) are used to nuance meaning and convey emotions, mood and attitude. For instance, laughter is associated with affective expressions, while fillers (e.g. um, ah) are used to hold the floor during a conversation. In this paper, we present an automatic non-verbal vocal event detection system focusing on the detection of laughter and fillers. We extend our system presented during the Interspeech 2013 Social Signals Sub-challenge (that was the winning entry in …
- Date
- 2016
- Authors
- Rahul Gupta, Kartik Audhkhasi, Sungbok Lee, Shrikanth Narayanan
- Journal
- Computer Speech & Language
- Volume
- 36
- Pages
- 72-92
- Publisher
- Academic Press