Publications
PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches for Speech Emotion Recognition Using Pre-trained Speech Models
Abstract
Many recent studies have focused on fine-tuning pre-trained models for speech emotion recognition (SER), achieving promising performance compared to traditional methods that rely largely on low-level, knowledge-inspired acoustic features. These pre-trained speech models learn general-purpose speech representations from large-scale datasets using self-supervised or weakly-supervised learning objectives. Despite the significant advances that pre-trained architectures have brought to SER, fine-tuning these large models for different datasets requires saving a copy of the entire weight parameters for each, rendering them impractical to deploy in real-world settings. As an alternative, this work explores parameter-efficient fine-tuning (PEFT) approaches for adapting pre-trained speech models for emotion recognition. Specifically, we evaluate the efficacy of adapter tuning, embedding prompt tuning, and …
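To make the contrast with full fine-tuning concrete, below is a minimal sketch of adapter tuning, one of the PEFT approaches the abstract names. The module names, bottleneck dimension, and the toy stand-in for a pre-trained encoder layer are illustrative assumptions, not the paper's exact configuration; the point is that only the small adapter is trained while the backbone stays frozen.

```python
# Minimal adapter-tuning sketch (assumed setup, not the paper's exact code).
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, apply a non-linearity, up-project, then add a residual."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the pre-trained representation
        # intact at initialization; the adapter learns a small correction.
        return x + self.up(self.act(self.down(x)))

# Toy "pre-trained" layer standing in for a transformer encoder block.
hidden_dim = 768
backbone_layer = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.GELU())
adapter = BottleneckAdapter(hidden_dim)

# Freeze the backbone; only the adapter receives gradients, so each new
# dataset stores adapter weights instead of a full model copy.
for p in backbone_layer.parameters():
    p.requires_grad = False

x = torch.randn(4, 100, hidden_dim)      # (batch, frames, features)
h = adapter(backbone_layer(x))           # adapted speech representation

trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in backbone_layer.parameters())
print(f"trainable params: {trainable} / {total} ({100 * trainable / total:.1f}%)")
```

With a 64-dimensional bottleneck against a 768-dimensional hidden size, the adapter accounts for only a few percent of the layer's parameters, which is the storage saving that motivates PEFT for per-dataset adaptation.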
- Date: September 10, 2023
- Authors: Tiantian Feng, Shrikanth Narayanan
- Conference: 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII)
- Pages: 1-8
- Publisher: IEEE