Department of Electrical Engineering, Speech Analysis and Interpretation Lab, and Integrated Media Systems Center University of Southern California, CA, USA

Abstract

On-line speaker indexing sequentially detects the points where the speaker identity changes in a multi-speaker audio stream, and classifies each speaker segment. This paper addresses two challenges. The first relates to monitoring, which requires on-line processing. The second relates to the fact that the number and identity of the speakers are unknown, so the indexing must be done in an unsupervised process. To address these issues, we apply a predetermined generic speaker-independent model set, the Sample Speaker Model (SSM). This set can be useful for more accurate speaker modeling and clustering without requiring training models on target speaker data. Once a speaker-independent model is selected from the sample models, it is progressively adapted into a speaker-dependent model. Experiments were performed with the Speaker Recognition Benchmark NIST Speech (1999). Results showed that our new technique, simulated using the Markov Chain Monte Carlo method, gave 92.47% indexing accuracy on telephone conversation data.

Date: 2003
Authors: Soonil Kwon, Shrikanth Narayanan
Journal: Automatic Speech Recognition and Understanding
Pages: 423
Publisher: IEEE


