VISTA researchers are developing core algorithms for automatic speech recognition (ASR) technology. Recent research has focused on building end-to-end neural speech recognition systems in multiple languages. End-to-end systems are more robust and offer the potential to approach human performance under some conditions.
With funding from multiple government research programs, we are developing deep-learning algorithms and approaches for training end-to-end speech recognition systems that require very little supervision and are robust to both background noise and accent variation.
Key areas covered by this work include recurrent neural networks, sequence-to-sequence modeling, feature extraction, and data augmentation.
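
To make the link between data augmentation and noise robustness concrete, here is a minimal sketch of one common augmentation: mixing background noise into clean speech at a chosen signal-to-noise ratio (SNR). It is an illustration only, not necessarily the method used in this work. The function names `mix_at_snr` and `measure_snr_db`, the sine tone standing in for speech, and the 10 dB target are all assumptions made for the example.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to hit a target SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must both be non-silent")
    # Required noise power follows from SNR_dB = 10 * log10(P_speech / P_noise).
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measure_snr_db(speech, mixed):
    """Measure the SNR of a mixture, given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixed)]
    speech_power = sum(s * s for s in speech)
    noise_power = sum(r * r for r in residual)
    return 10.0 * math.log10(speech_power / noise_power)


rng = random.Random(0)  # fixed seed so the example is repeatable
# Hypothetical stand-in for speech: one second of a 440 Hz tone at 16 kHz.
speech = [math.sin(2.0 * math.pi * 440.0 * t / 16000.0) for t in range(16000)]
# Hypothetical stand-in for background noise: white Gaussian noise.
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

In a training setup, a mixture like `noisy` would typically be paired with the original transcript, and the SNR and noise source would be varied across examples so the model sees the same utterance under many acoustic conditions.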