Speech data augmentation

Abstract

This application relates to data augmentation of audio, speech or multimodal information signals, and more particularly to such augmentation for speech emotion recognition. Data augmentation is a method for generating synthetic data for classification, tracking or recognition machine learning tasks. Data augmentation may be effective for machine learning and deep learning tasks where few training examples are available or some labels are underrepresented in training (sparse data). Traditional data augmentation techniques for audio, speech and multimodal processing applications have relied on perturbing the speech signal in the time and/or frequency domain, e.g., time-scale modification, pitch modification, or vocal-tract length modification, or on modifying the recording conditions under which the signal was recorded, e.g., varying the types and amounts of noise. Such data augmentation methods …
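
As a rough illustration of the traditional perturbation-style augmentations the abstract mentions (not the method claimed in this application), the sketch below adds white noise at a chosen signal-to-noise ratio and applies a simple speed perturbation. The function names `add_noise_at_snr` and `speed_perturb`, the toy sine-wave input, and the parameter values are all assumptions made for illustration. The speed change here uses plain linear-interpolation resampling, which alters pitch together with duration, so it is only a stand-in for true time-scale modification.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db` dB."""
    rng = rng or random.Random(0)
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Rescale the noise so its empirical power matches the target power.
    raw_power = sum(n * n for n in noise) / len(noise)
    scale = math.sqrt(noise_power / raw_power)
    return [x + scale * n for x, n in zip(signal, noise)]


def speed_perturb(signal, factor):
    """Resample by linear interpolation; factor > 1 shortens (speeds up) the signal."""
    out_len = max(1, int(round(len(signal) / factor)))
    last = len(signal) - 1
    out = []
    for i in range(out_len):
        pos = i * factor                  # fractional read position in the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


# Hypothetical toy input: one second of a 220 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]

noisy = add_noise_at_snr(tone, snr_db=10)   # same length, noise 10 dB below the signal
faster = speed_perturb(tone, factor=1.1)    # about 9% shorter than the original
```

Each call yields a new training example that keeps the original label, which is how such perturbations are typically used to enlarge a sparse training set.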

Date: 2020
Authors: G Paraskevopoulos, E Chatziagapi, T Giannakopoulos, A Potamianos, ...
Inventors: Georgios Paraskevopoulos, Evangelia Chatziagapi, Theodoros Giannakopoulos, Alexandros Potamianos, Shrikanth Narayanan
Patent_office: US
Application_number: 16852793

Information Sciences Institute
