Title :
Data selection for acoustic emotion recognition: Analyzing and comparing utterance and sub-utterance selection strategies
Author :
Duc Le;Emily Mower Provost
Author_Institution :
Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA
Abstract :
Data selection is an important component of cross-corpus training and semi-supervised/active learning. However, its effect on acoustic emotion recognition is still not well understood. In this work, we perform an in-depth exploration of various data selection strategies for emotion classification from speech using classifier agreement as the selection metric. Our methods span both the traditional utterance as well as the less explored sub-utterance level. A median unweighted average recall of 70.68%, comparable to the winner of the 2009 INTERSPEECH Emotion Challenge, was achieved on the FAU Aibo 2-class problem using less than 50% of the training data. Our results indicate that sub-utterance selection leads to slightly faster convergence and significantly more stable learning. In addition, diversifying instances in terms of classifier agreement produces a faster learning rate, whereas selecting those near the median results in higher stability. We show that the selected data instances can be explained intuitively based on their acoustic properties and position within an utterance. Our work helps provide a deeper understanding of the strengths, weaknesses, and trade-offs of different data selection strategies for speech emotion recognition.
Keywords :
"Hidden Markov models","Training","Emotion recognition","Speech","Training data","Measurement","Acoustics"
Conference_Titel :
Affective Computing and Intelligent Interaction (ACII), 2015 International Conference on
Electronic_ISBN :
2156-8111
DOI :
10.1109/ACII.2015.7344564