Mel frequency spectral domain defenses against adversarial attacks on speech recognition systems

Abstract

Automatic speech recognition (ASR) systems are vulnerable to adversarial attacks because they rely on machine learning models. Many of the defenses explored for ASR systems simply adapt approaches developed for the image domain. This paper explores speech-specific defenses in the feature domain and introduces a defense method called mel domain noise flooding (MDNF). MDNF adds noise to the mel spectrogram representation of the speech before re-synthesizing the audio signal that is input to the ASR system. The defense is evaluated against strong white-box threat models and shows competitive robustness.
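
As a rough illustration of the noise-flooding step described in the abstract, the sketch below adds non-negative noise to a mel power spectrogram. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `flood_mel`, the `snr_db` parameter, the half-normal noise distribution, and the definition of SNR as a ratio of mean mel-bin powers are all hypothetical choices made for this example. Computing the mel spectrogram and re-synthesizing audio from it are left out.

```python
import numpy as np


def flood_mel(mel_power, snr_db, rng=None):
    """Add non-negative noise to a mel power spectrogram.

    Assumption (not from the paper): the noise level is set by a target
    signal-to-noise ratio in dB, measured as mean mel power over mean
    noise power.
    """
    rng = np.random.default_rng() if rng is None else rng
    mel_power = np.asarray(mel_power, dtype=np.float64)

    # Average power of the clean spectrogram across all bins and frames.
    signal_power = mel_power.mean()

    # Mean noise power implied by the requested SNR.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))

    # Assumption: half-normal noise, rescaled so its mean equals noise_power.
    noise = np.abs(rng.standard_normal(mel_power.shape))
    noise *= noise_power / noise.mean()

    # Additive noise keeps every mel bin non-negative.
    return mel_power + noise
```

In a full pipeline, the flooded spectrogram would then be inverted back to a waveform (for example with a vocoder or a Griffin-Lim style method) before being passed to the ASR model; the abstract does not say which inversion method is used.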

Date: 2023
Authors: Nicholas Mehlman, Anirudh Sreeram, Raghuveer Peri, Shrikanth Narayanan
Journal: JASA Express Letters
Volume: 3
Issue: 3
Publisher: AIP Publishing

