DocumentCode
134271
Title
Speech emotion classification using acoustic features
Author
Shizhe Chen ; Qin Jin ; Xirong Li ; Gang Yang ; Jieping Xu
Author_Institution
Multimedia Comput. Lab., Renmin Univ. of China, Beijing, China
fYear
2014
fDate
12-14 Sept. 2014
Firstpage
579
Lastpage
583
Abstract
Emotion recognition from speech is a challenging research area with wide applications. In this paper we explore one of the key aspects of building an emotion recognition system: generating a suitable feature representation. We extract features from four angles: (1) low-level acoustic features such as intensity, F0, jitter, shimmer and spectral contours, together with statistical functions over these features, (2) a set of features derived from segmental cepstral-based features scored against emotion-dependent Gaussian mixture models (GMMs), (3) a set of features derived from a set of low-level acoustic codewords, and (4) GMM supervectors constructed by stacking the means, covariances or weights of the adapted mixture components for each utterance. We apply these features to emotion recognition independently and jointly and compare their performance on this task. We build a support vector machine (SVM) classifier based on these features on the IEMOCAP database. Our system's four-class emotion recognition accuracy of 71.9% outperforms the previously reported best results on this dataset.
Keywords
Gaussian processes; acoustic signal processing; cepstral analysis; emotion recognition; feature extraction; mixture models; signal classification; speech recognition; support vector machines; GMM supervectors; IEMOCAP database; SVM classifier; emotion recognition system; emotion-dependent Gaussian mixture models; feature representation; four-class emotion recognition accuracy; low-level acoustic codewords; low-level acoustic features; segmental cepstral-based features; speech emotion classification; statistical functions; support vector machine; Accuracy; Acoustics; Speech; Acoustic features
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location
Singapore
Type
conf
DOI
10.1109/ISCSLP.2014.6936664
Filename
6936664