Publications
Department of Electrical Engineering, Speech Analysis and Interpretation Lab, and Integrated Media Systems Center, University of Southern California, CA, USA
Abstract
On-line speaker indexing sequentially detects the points where the speaker identity changes in a multi-speaker audio stream, and classifies each speaker segment. This paper addresses two challenges. The first relates to monitoring, which requires on-line processing. The second relates to the fact that the number and identity of the speakers are unknown, so the indexing must be performed as an unsupervised process. To address these issues, we apply a predetermined generic speaker-independent model set, the Sample Speaker Model (SSM). This set can be useful for more accurate speaker modeling and clustering without requiring models to be trained on target speaker data. Once a speaker-independent model is selected from the sample models, it is progressively adapted into a speaker-dependent model. Experiments were performed with the Speaker Recognition Benchmark NIST Speech (1999). Results showed that our new technique, simulated using the Markov Chain Monte Carlo method, gave 92.47% indexing accuracy on telephone conversation data.
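The select-then-adapt idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each sample model is a unit-variance Gaussian described only by its mean (so the highest likelihood corresponds to the smallest squared distance), and the names `index_stream`, `threshold`, and `rate` are hypothetical, chosen only for illustration. The Markov Chain Monte Carlo component mentioned in the abstract is not modeled here.

```python
def mean_vector(frames):
    """Average the feature vectors of one segment."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def index_stream(segments, sample_means, threshold=4.0, rate=0.5):
    """Label segments one at a time, without knowing the speakers in advance.

    threshold and rate are illustrative assumptions, not values from the paper.
    """
    speakers = []  # adapted, speaker-dependent means found so far
    labels = []
    for frames in segments:
        seg = mean_vector(frames)
        # Closest already-known speaker, or None if no speaker exists yet.
        best = min(range(len(speakers)),
                   key=lambda i: sq_dist(seg, speakers[i]),
                   default=None)
        if best is None or sq_dist(seg, speakers[best]) > threshold:
            # Treat as a new speaker: seed from the closest sample model.
            seed = min(sample_means, key=lambda m: sq_dist(seg, m))
            speakers.append(list(seed))
            best = len(speakers) - 1
        # Progressive adaptation: move the model toward the new segment.
        speakers[best] = [m + rate * (s - m)
                          for m, s in zip(speakers[best], seg)]
        labels.append(best)
    # A change point is any segment whose label differs from the previous one.
    change_points = [i for i in range(1, len(labels))
                     if labels[i] != labels[i - 1]]
    return labels, change_points
```

Calling `index_stream` with a list of segments (each a list of feature vectors) and a list of sample means returns one speaker label per segment together with the indices where the label changes. Labels are assigned in order of first appearance, which matches the unsupervised setting described above.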
- Date: 2003
- Authors: Soonil Kwon, Shrikanth Narayanan
- Journal: Automatic Speech Recognition and Understanding
- Pages: 423
- Publisher: IEEE