DocumentCode :
3430642
Title :
Speech emotion recognition with acoustic and lexical features
Author :
Qin Jin ; Chengxin Li ; Shizhe Chen ; Huimin Wu
Author_Institution :
Comput. Sci. Dept., Renmin Univ. of China, Beijing, China
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
4749
Lastpage :
4753
Abstract :
In this paper, we explore one of the key aspects of building an emotion recognition system: generating suitable feature representations. We generate feature representations at both the acoustic and lexical levels. At the acoustic level, we first extract low-level features such as intensity, F0, jitter, shimmer, and spectral contours. We then generate different acoustic feature representations based on these low-level features, including statistics over these features, a new representation derived from a set of low-level acoustic codewords, and a new representation from Gaussian Supervectors. At the lexical level, we propose a new feature representation named emotion vector (eVector). We also use the traditional Bag-of-Words (BoW) feature. We apply these feature representations to emotion recognition and compare their performance on the USC-IEMOCAP database. We also combine these different feature representations via early fusion and late fusion. Our experimental results show that late fusion of both acoustic and lexical features achieves a four-class emotion recognition accuracy of 69.2%.
Keywords :
computational linguistics; emotion recognition; signal representation; speech processing; speech recognition; Gaussian supervectors; USC-IEMOCAP database; acoustic feature representations; acoustic features; bag of words feature; eVector; emotion recognition system; emotion vector; lexical feature representation; lexical features; low level acoustic codewords; low level features; speech emotion recognition; speech fundamental frequency; speech intensity; speech jitter; speech shimmer; speech spectral contours; Accuracy; Acoustics; Emotion recognition; Feature extraction; Speech; Speech recognition; Support vector machines; Acoustic features; Emotion lexicon; Emotion recognition; Lexical features; Support vector machine;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178872
Filename :
7178872