Robust recognition of children's speech

Abstract

Developmental changes in speech production introduce age-dependent spectral and temporal variability in the speech signal produced by children. Such variability poses challenges for robust automatic recognition of children's speech. Through an analysis of age-related acoustic characteristics of children's speech in the context of automatic speech recognition (ASR), effects such as frequency scaling of spectral envelope parameters are demonstrated. Recognition experiments using acoustic models trained on adult speech and tested against speech from children of various ages clearly show performance degradation with decreasing age. On average, the word error rates are two to five times higher for children's speech than for adult speech. Various techniques for improving ASR performance on children's speech are reported. A speaker normalization algorithm that combines frequency warping and model …

Date: November 2003
Authors: Alexandros Potamianos, Shrikanth Narayanan
Journal: IEEE Transactions on Speech and Audio Processing
Volume: 11
Issue: 6
Pages: 603-616
Publisher: IEEE
