Title :
Mandarin emotion recognition in speech
Author :
Pao, Tsang-Long ; Chen, Yu-Te
Author_Institution :
Dept. of Comput. Sci. & Eng., Tatung Univ., Taiwan
Date :
30 Nov.-3 Dec. 2003
Abstract :
Humans interact with one another in several ways, such as speech, gesture, and eye contact. Among these, speech is the most effective means of communication, since people can readily exchange information through it without the need for any other tool. Emotion colors speech: it can make the meaning more complex and conveys how something is said. A Mandarin speech-based emotion classification method is presented. Five basic human emotions are investigated: anger, boredom, happiness, neutral, and sadness. The extracted features comprise 16 LPC (linear predictive cepstrum) coefficients and 20 MFCC (Mel-frequency cepstral coefficient) components, and the recognizer is based on two statistical pattern recognition techniques, the minimum-distance method and the nearest class mean method. Minimum-distance emotion recognition obtains an average accuracy of 79.1%, while nearest class mean emotion recognition achieves a higher accuracy of 89.1%.
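The two classifiers named in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance metric or exactly how either method is realized, so the code below makes assumptions: Euclidean distance is used throughout, the minimum-distance method is read as "assign the label of the closest training sample", and the nearest class mean method is read as "assign the label of the closest per-class mean vector". The function names and the two-dimensional toy vectors are hypothetical; they stand in for the 36-dimensional LPC and MFCC feature vectors and are not data from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_means(samples):
    """Average the training vectors of each class.

    samples maps an emotion label to a list of feature vectors.
    """
    means = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        means[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return means


def nearest_class_mean(x, means):
    """Assumed reading: pick the class whose mean vector is closest to x."""
    return min(means, key=lambda label: euclidean(x, means[label]))


def minimum_distance(x, samples):
    """Assumed reading: pick the class of the single closest training vector."""
    best_label, best_dist = None, float("inf")
    for label, vectors in samples.items():
        for v in vectors:
            d = euclidean(x, v)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label


# Hypothetical 2-D toy features; real vectors would hold the LPC and MFCC values.
train = {
    "anger": [[2.0, 2.0], [3.0, 3.0]],
    "sadness": [[-2.0, -2.0], [-3.0, -1.0]],
}
means = class_means(train)
query = [1.5, 2.5]

print(nearest_class_mean(query, means))
print(minimum_distance(query, train))
```

On this toy data both classifiers assign the query to the same class. On real speech features they can disagree, which is consistent with the different accuracies the abstract reports for the two methods.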
Keywords :
emotion recognition; human factors; natural languages; pattern classification; speech recognition; MFCC components; Mandarin emotion recognition; Mel-frequency cepstral coefficients; anger; boredom; emotion classification; feature extraction; happiness; linear predictive cepstrum coefficients; minimum-distance method; nearest class mean method; neutral; sadness; statistical pattern recognition techniques; Cepstral analysis; Cepstrum; Data mining; Emotion recognition; Feature extraction; Humans; Linear predictive coding; Mel frequency cepstral coefficient; Pattern recognition; Speech
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318445