Acoustic-syntactic maximum entropy model for automatic prosody labeling

Abstract

In this paper we describe an automatic prosody labeling framework that exploits both language and speech information, with the aim of incorporating prosody within a speech-to-speech translation framework. We propose a maximum entropy syntactic-prosodic model that achieves accuracies of 85.22% and 91.54% for pitch accent and boundary tone labeling, respectively, on the Boston University Radio News corpus. We model the acoustic-prosodic stream with two different models: a maximum entropy model and a traditional HMM. Finally, we couple the syntactic-prosodic and acoustic-prosodic components to achieve pitch accent and boundary tone classification accuracies of 86.01% and 93.09%, respectively.

Date: December 10, 2006
Authors: Vivek Rangarajan, Shrikanth Narayanan, Srinivas Bangalore
Conference: 2006 IEEE Spoken Language Technology Workshop
Pages: 74-77
Publisher: IEEE


