Pashto speech recognition with limited pronunciation lexicon
Abstract
Automatic speech recognition (ASR) for low-resource languages continues to be a difficult problem. In particular, colloquial dialects of Arabic, Farsi, and Pashto pose significant challenges in pronunciation dictionary creation. Therefore, most state-of-the-art ASR engines rely on the grapheme-as-phoneme approach for creating pronunciation dictionaries in these languages. While the grapheme approach simplifies ASR training, it performs significantly worse than a system trained with a high-quality phonetic dictionary. In this paper, we explore two techniques for bridging the performance gap between the grapheme and the phonetic approaches without requiring manual pronunciations for all the words in the training data. The first approach is based on learning letter-to-sound rules from a small set of manual pronunciations in Pashto, and the second approach uses a hybrid phoneme/grapheme representation for …
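For readers unfamiliar with the grapheme-as-phoneme approach mentioned in the abstract, the minimal Python sketch below shows the general idea: each word's "pronunciation" is simply its sequence of written characters, so no manual phonetic transcription is needed. The function name `grapheme_lexicon` and the sample word are assumptions made for illustration; this is not the authors' implementation, and it does not cover the letter-to-sound or hybrid techniques the paper explores.

```python
def grapheme_lexicon(words):
    """Build a grapheme-as-phoneme lexicon (illustrative sketch only).

    Each word maps to a list of its characters, which an ASR system
    would then treat as the acoustic modeling units for that word.
    """
    lexicon = {}
    for word in words:
        # Drop whitespace; every remaining character becomes one unit.
        lexicon[word] = [ch for ch in word if not ch.isspace()]
    return lexicon


if __name__ == "__main__":
    # Hypothetical sample entry: a three-letter word in Arabic script.
    sample = ["کور"]
    for word, units in grapheme_lexicon(sample).items():
        print(word, " ".join(units))
```

Because the written form is used directly, any sound that the orthography leaves unwritten has no unit of its own in such a lexicon, which is one plausible reason a phonetic dictionary can perform better.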
- Date: March 14, 2010
- Authors: Rohit Prasad, Stavros Tsakalidis, Ivan Bulyko, Chia-lin Kao, Prem Natarajan
- Conference: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
- Pages: 5086-5089
- Publisher: IEEE