DocumentCode
3583602
Title
Multimodal emotion recognition in audiovisual communication
Author
Schuller, Björn ; Lang, Michael ; Rigoll, Gerhard
Author_Institution
Inst. for Human-Machine-Interaction, Tech. Univ. Munich, Germany
Volume
1
fYear
2002
fDate
2002
Firstpage
745
Abstract
This paper discusses innovative techniques to automatically estimate a user's emotional state by analyzing the speech signal and haptical interaction on a touch-screen or via mouse. Knowledge of a user's emotion permits adaptive strategies that strive for a more natural and robust interaction. We classify seven emotional states: surprise, joy, anger, fear, disgust, sadness, and a neutral user state. The user's emotion is extracted by a parallel stochastic analysis of his spoken and haptical machine interactions while understanding the desired intention. The introduced methods are based on the common prosodic speech features pitch and energy, but also rely on the semantic and intention-based features wording, degree of verbosity, temporal intention and word rate, and finally the history of user utterances. As a further modality, touch-screen or mouse interaction is also analyzed. The estimates based on these features are integrated in a multimodal way. The introduced methods are based on results of user studies. A realization proved to be reliable compared with the subjective impressions of test subjects.
Keywords
adaptive systems; emotion recognition; feature extraction; haptic interfaces; mouse controllers (computers); speaker recognition; speech-based user interfaces; stochastic processes; touch sensitive screens; adaptive strategies; anger; audiovisual communication; degree of verbosity; disgust; fear; haptical interaction; intention based features; joy; mouse interaction; multimodal emotion recognition; neutral user state; parallel stochastic analysis; pitch; prosodic speech features; sadness; semantic features; speech energy; speech signal analysis; surprise; temporal intention; touch screen; user utterance history; word rate; wording; Emotion recognition; Humans; Internet; Lungs; Mice; Natural languages; Signal analysis; Speech analysis; State estimation; Stochastic processes;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia and Expo, 2002. ICME '02. Proceedings. 2002 IEEE International Conference on
Print_ISBN
0-7803-7304-9
Type
conf
DOI
10.1109/ICME.2002.1035889
Filename
1035889