Publications
Emotion to emotion speech conversion in phoneme level
Abstract
The ability to synthesize emotional speech can make human–machine interaction more natural in spoken dialogue management. This study investigates the effectiveness of prosodic and spectral modification at the phoneme level for emotion‐to‐emotion speech conversion. Prosody modification is performed with the TD‐PSOLA algorithm (Moulines and Charpentier, 1990). The spectral envelopes of source phonemes are also transformed to match those of target phonemes using an LPC‐based spectral transformation approach (Kain, 2001). Prosodic speech parameters (F0, duration, and energy) for target phonemes are estimated from statistics obtained by analyzing an emotional speech database of happy, angry, sad, and neutral utterances collected from actors. Listening experiments conducted with native American English speakers indicate that the modification of prosody only or spectrum only is not …
- Date: 2004
- Authors: Murtaza Bulut, Serdar Yildirim, Carlos Busso, Chul Min Lee, Ebrahim Kazemzadeh, Sungbok Lee, Shrikanth Narayanan
- Journal: The Journal of the Acoustical Society of America
- Volume: 116
- Issue: 4_Supplement
- Pages: 2481-2481
- Publisher: Acoustical Society of America