Regional Information Center for Science and Technology - Emotion recognition from Mandarin speech signals

DocumentCode :

2839346

Title :

Emotion recognition from Mandarin speech signals

Author :

Tsang-Long Pao ; Chen, Yu-Te ; Yeh, Jun-Heng

Author_Institution :

Dept. of Comput. Sci. & Eng., Tatung Univ., Taipei, Taiwan

fYear :

2004

fDate :

15-18 Dec. 2004

Firstpage :

301

Lastpage :

304

Abstract :

In this paper, a Mandarin speech based emotion classification method is presented. Five primary human emotions including anger, boredom, happiness, neutral and sadness are investigated. In emotion classification of speech signals, the conventional features are statistics of fundamental frequency, loudness, duration and voice quality. However, the recognition accuracy of systems employing these features degrades substantially when more than two valence emotion categories are invoked. For speech emotion recognition, we select 16 LPC coefficients, 12 LPCC components, 16 LFPC components, 16 PLP coefficients, 20 MFCC components and jitter as the basic features to form the feature vector. A Mandarin corpus recorded by 12 non-professional speakers is employed. The recognizer presented in this paper is based on three recognition techniques: LDA, K-NN, and HMMs. Experimental results show that the selected features are robust and effective for emotion recognition, not only in the arousal dimension but also in the valence dimension.
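The feature vector described in the abstract can be illustrated with a minimal sketch. The group sizes (16 LPC, 12 LPCC, 16 LFPC, 16 PLP, 20 MFCC) come from the abstract; everything else is an assumption made for illustration: jitter is treated as a single scalar (giving 81 dimensions), the helper names `build_feature_vector`, `knn_classify`, and `toy_vector` are hypothetical, the data are synthetic, and the plain Euclidean k-NN shown here is a generic stand-in rather than the authors' recognizer. Feature extraction from audio, LDA, and HMMs are not shown.

```python
import math
from collections import Counter

# Feature groups and sizes as listed in the abstract.
# Assumption: jitter contributes one scalar value per utterance.
FEATURE_LAYOUT = {
    "LPC": 16,
    "LPCC": 12,
    "LFPC": 16,
    "PLP": 16,
    "MFCC": 20,
    "jitter": 1,
}
FEATURE_DIM = sum(FEATURE_LAYOUT.values())  # 81 under the assumption above


def build_feature_vector(groups):
    """Concatenate feature groups in FEATURE_LAYOUT order, checking sizes."""
    vector = []
    for name, size in FEATURE_LAYOUT.items():
        values = list(groups[name])
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(values)
    return vector


def knn_classify(train, query, k=3):
    """Majority vote among the k training vectors nearest to query (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def toy_vector(level):
    """Synthetic vector with every component set to `level` (illustration only)."""
    return build_feature_vector(
        {name: [level] * size for name, size in FEATURE_LAYOUT.items()}
    )


# Synthetic training set: two made-up clusters labeled with two of the
# five emotion categories named in the abstract.
train = [
    (toy_vector(0.0), "neutral"),
    (toy_vector(0.1), "neutral"),
    (toy_vector(1.0), "anger"),
    (toy_vector(1.1), "anger"),
]
print(knn_classify(train, toy_vector(0.9), k=3))
```

With real data, each group would be filled with coefficients computed from the speech signal, and features would typically be normalized before distance-based classification so that no single group dominates.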

Keywords :

cepstral analysis; emotion recognition; feature extraction; hidden Markov models; jitter; linear predictive coding; signal classification; HMM; K-NN; LDA recognition technique; LFPC components; LPC coefficients; LPCC components; MFCC components; Mandarin speech based emotion classification method; Mel-frequency cepstral coefficients; PLP coefficients; anger; arousal dimension; boredom; feature vectors; happiness; linear prediction cepstral coefficients; log frequency power coefficients; neutral emotion; primary human emotions; sadness; valence emotion categories; Degradation; Emotion recognition; Humans; Jitter; Linear discriminant analysis; Linear predictive coding; Mel frequency cepstral coefficient; Robustness; Speech; Statistics;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Chinese Spoken Language Processing, 2004 International Symposium on

Print_ISBN :

0-7803-8678-7

Type :

conf

DOI :

10.1109/CHINSL.2004.1409646

Filename :

1409646

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2839346