CNMF-based acoustic features for noise-robust ASR

Abstract

We present an algorithm that uses convolutive non-negative matrix factorization (CNMF) to create noise-robust features for automatic speech recognition (ASR). In noise-robust ASR, CNMF is typically used to remove noise from noisy speech prior to feature extraction. However, we find that denoising introduces distortion and artifacts, which can degrade ASR performance. Instead, we propose using the time-activation matrices from CNMF as acoustic model features. In this paper, we describe how to create speech and noise dictionaries that generate noise-robust time-activation matrices from noisy speech. Using the time-activation matrices created by our proposed algorithm, we achieve an 11.8% relative improvement in the word error rate on the Aurora 4 corpus compared to using log-mel filterbank energies. Furthermore, we attain a 13.8% relative improvement over log-mel filterbank energies when we combine them …
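To make the idea of time-activation matrices concrete, the following is a minimal sketch of generic CNMF activation estimation with a fixed dictionary, using the standard multiplicative update for the generalized KL divergence. It is not the algorithm from the paper, whose details are not given in this abstract. The function names (`shift`, `reconstruct`, `kl_divergence`, `cnmf_activations`), the array layout, and the idea of stacking a speech and a noise dictionary along the basis axis are assumptions made for illustration only.

```python
import numpy as np


def shift(x, t):
    """Shift the columns of x right by t (left if t < 0), zero-filling."""
    out = np.zeros_like(x)
    if t == 0:
        out[:] = x
    elif t > 0:
        out[:, t:] = x[:, :-t]
    else:
        out[:, :t] = x[:, -t:]
    return out


def reconstruct(W, H):
    """CNMF model: V is approximated by sum_t W[t] @ shift(H, t).

    W has shape (T, F, K): T time lags, F frequency bins, K bases.
    H has shape (K, N): one activation row per basis, N frames.
    """
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))


def kl_divergence(V, L, eps=1e-12):
    """Generalized KL divergence between V and its approximation L."""
    return float(np.sum(V * np.log((V + eps) / (L + eps)) - V + L))


def cnmf_activations(V, W, n_iter=100, eps=1e-12, seed=0):
    """Estimate non-negative activations H for a FIXED dictionary W.

    Only H is updated; W stays as given (for example, an assumed
    concatenation of pre-trained speech and noise bases).
    """
    rng = np.random.default_rng(seed)
    n_lags, _, n_bases = W.shape
    H = rng.random((n_bases, V.shape[1])) + 0.1
    ones = np.ones_like(V)
    # The denominator does not depend on H, so compute it once.
    den = sum(W[t].T @ shift(ones, -t) for t in range(n_lags)) + eps
    for _ in range(n_iter):
        ratio = V / (reconstruct(W, H) + eps)
        num = sum(W[t].T @ shift(ratio, -t) for t in range(n_lags))
        H *= num / den
    return H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical dictionaries: 4 lags, 20 frequency bins, 3 bases each.
    W_speech = rng.random((4, 20, 3))
    W_noise = rng.random((4, 20, 3))
    W = np.concatenate([W_speech, W_noise], axis=2)  # stack along bases
    V = reconstruct(W, rng.random((6, 50)))          # synthetic spectrogram
    H = cnmf_activations(V, W)
    features = H.T  # one activation vector per frame
    print(features.shape, kl_divergence(V, reconstruct(W, H)))
```

In this sketch each column of `H` would serve as the feature vector for one frame. How the paper builds its dictionaries and post-processes the activations before feeding them to the acoustic model is not specified here.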

Date: 2016
Authors: Colin Vaz, Dimitrios Dimitriadis, Samuel Thomas, Shrikanth Narayanan
Conference: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Pages: 5735-5739
Publisher: IEEE


