DocumentCode :
1668807
Title :
Deep learning for robust feature generation in audiovisual emotion recognition
Author :
Yelin Kim ; Honglak Lee ; Emily Mower Provost
Author_Institution :
Electr. Eng. & Comput. Sci., Univ. of Michigan, Ann Arbor, MI, USA
fYear :
2013
Firstpage :
3687
Lastpage :
3691
Abstract :
Automatic emotion recognition systems predict high-level affective content from low-level human-centered signal cues. These systems have seen great improvements in classification accuracy, due in part to advances in feature selection methods. However, many of these feature selection methods capture only linear relationships between features or alternatively require the use of labeled data. In this paper we focus on deep learning techniques, which can overcome these limitations by explicitly capturing complex non-linear feature interactions in multimodal data. We propose and evaluate a suite of Deep Belief Network models, and demonstrate that these models show improvement in emotion classification performance over baselines that do not employ deep learning. This suggests that the learned high-order non-linear relationships are effective for emotion recognition.
Keywords :
emotion recognition; learning (artificial intelligence); audiovisual emotion recognition; deep belief network models; deep learning techniques; emotion classification; feature selection methods; high-level affective content; low-level human-centered signal cues; multimodal data; robust feature generation; Accuracy; Acoustics; Emotion recognition; Speech; Speech processing; Speech recognition; Training; deep belief networks; deep learning; emotion classification; multimodal features; unsupervised feature learning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638346
Filename :
6638346