DocumentCode
1863532
Title
Hidden Markov model-based speech emotion recognition
Author
Schuller, Bjorn ; Rigoll, Gerhard ; Lang, Manfred
Author_Institution
Inst. for Human-Comput. Commun., Technische Univ. Munchen, Germany
Volume
1
fYear
2003
fDate
6-9 July 2003
Abstract
In this contribution we introduce speech emotion recognition using continuous hidden Markov models. Two methods are proposed and compared throughout the paper. In the first method, global statistics of an utterance are classified by Gaussian mixture models using features derived from the raw pitch and energy contours of the speech signal. The second method introduces increased temporal complexity by applying continuous hidden Markov models with several states, using low-level instantaneous features instead of global statistics. The paper addresses the design of working recognition engines and the results achieved with both alternatives. A speech corpus consisting of acted and spontaneous emotion samples in German and English is described in detail. Both engines were trained and tested on this same speech corpus. Recognition of seven discrete emotions exceeded an 86% recognition rate. As a basis for comparison, the judgment of human deciders classifying the same corpus, at a 79.8% recognition rate, was analyzed.
Keywords
Gaussian processes; emotion recognition; hidden Markov models; speech recognition; Gaussian mixture models; energy contour; global statistics; hidden Markov models; raw pitch; recognition rate; speech corpus; speech emotion recognition; speech signal; utterance; Data mining; Emotion recognition; Engines; Hidden Markov models; Humans; Natural languages; Speech analysis; Speech processing; Statistics; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
Print_ISBN
0-7803-7965-9
Type
conf
DOI
10.1109/ICME.2003.1220939
Filename
1220939