DocumentCode :
2839346
Title :
Emotion recognition from Mandarin speech signals
Author :
Pao, Tsang-Long ; Chen, Yu-Te ; Yeh, Jun-Heng
Author_Institution :
Dept. of Comput. Sci. & Eng., Tatung Univ., Taipei, Taiwan
fYear :
2004
fDate :
15-18 Dec. 2004
Firstpage :
301
Lastpage :
304
Abstract :
This paper presents a Mandarin speech-based emotion classification method. Five primary human emotions are investigated: anger, boredom, happiness, neutral, and sadness. In emotion classification of speech signals, the conventional features are statistics of fundamental frequency, loudness, duration, and voice quality. However, the recognition accuracy of systems employing these features degrades substantially when more than two valence emotion categories are involved. For speech emotion recognition, we select 16 LPC coefficients, 12 LPCC components, 16 LFPC components, 16 PLP coefficients, 20 MFCC components, and jitter as the basic features that form the feature vector. A Mandarin corpus recorded by 12 non-professional speakers is employed. The recognizer presented in this paper is based on three recognition techniques: LDA, K-NN, and HMMs. Experimental results show that the selected features are robust and effective for emotion recognition, not only in the arousal dimension but also in the valence dimension.
Keywords :
cepstral analysis; emotion recognition; feature extraction; hidden Markov models; jitter; linear predictive coding; signal classification; HMM; K-NN; LDA recognition technique; LFPC components; LPC coefficients; LPCC components; MFCC components; Mandarin speech based emotion classification method; Mel-frequency cepstral coefficients; PLP coefficients; anger; arousal dimension; boredom; feature vectors; happiness; linear prediction cepstral coefficients; log frequency power coefficients; neutral emotion; primary human emotions; sadness; valence emotion categories; Degradation; Humans; Linear discriminant analysis; Mel frequency cepstral coefficient; Robustness; Speech; Statistics
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing, 2004 International Symposium on
Print_ISBN :
0-7803-8678-7
Type :
conf
DOI :
10.1109/CHINSL.2004.1409646
Filename :
1409646