Title :
Speech emotion recognition based on HMM and SVM
Author :
Lin, Yi-Lin ; Wei, Gang
Author_Institution :
Sch. of Electron. & Inf. Eng., South China Univ. of Technol., Guangzhou, China
Abstract :
Automatic emotion recognition in speech is an active research area with a wide range of applications in human-machine interaction. This paper uses two classification methods, the hidden Markov model (HMM) and the support vector machine (SVM), to classify five emotional states: anger, happiness, sadness, surprise and a neutral state. In the HMM method, 39 candidate instantaneous features were extracted, and the sequential forward selection (SFS) method was used to find the best feature subset. The classification performance of the selected feature subset was then compared with that of the Mel-frequency cepstral coefficients (MFCCs). In the SVM-based method, a new vector measuring the difference between Mel-frequency-scale sub-band energies is proposed. The performance of the K-nearest neighbors (KNN) classifier using the proposed vector was also investigated. Both gender-dependent and gender-independent experiments were conducted on the Danish emotional speech (DES) database. The recognition rates of the HMM classifier were 98.9% for female subjects, 100% for male subjects, and 99.5% for gender-independent cases. When the SVM classifier and the proposed feature vector were employed, correct classification rates of 89.4%, 93.6% and 88.9% were obtained for male, female and gender-independent cases, respectively.
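The abstract names sequential forward selection (SFS) as the way the best subset of the 39 candidate features was found. The following is a minimal, generic sketch of greedy SFS in Python, not the authors' implementation. The function `sequential_forward_selection`, its `score` callback, the `max_features` limit and the toy scoring function are all hypothetical names introduced for illustration; in practice `score` would presumably return a cross-validated recognition rate for the classifier trained on the given feature subset.

```python
from typing import Callable, List, Tuple


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Tuple[int, ...]], float],
    max_features: int,
) -> Tuple[List[int], float]:
    """Greedy SFS: add one feature per step, remember the best subset seen."""
    selected: List[int] = []
    remaining = list(range(n_features))
    best_subset: List[int] = []
    best_score = float("-inf")

    while remaining and len(selected) < max_features:
        # Score every subset formed by adding one unused feature.
        candidates = [(score(tuple(selected + [f])), f) for f in remaining]
        # Keep the addition with the highest score (first one wins on ties).
        step_score, step_feature = max(candidates, key=lambda c: c[0])
        selected.append(step_feature)
        remaining.remove(step_feature)
        # Track the best-scoring subset over all steps.
        if step_score > best_score:
            best_score = step_score
            best_subset = list(selected)

    return best_subset, best_score


def toy_score(subset: Tuple[int, ...]) -> float:
    """Assumed stand-in for a recognition rate: three features help, others hurt."""
    useful = {3, 7, 20}
    return sum(1.0 if f in useful else -0.1 for f in subset)


# 39 candidate features, as in the abstract; the limit of 5 is an assumption.
best_subset, best_score = sequential_forward_selection(39, toy_score, max_features=5)
```

Because each step only ever adds a feature, SFS evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing feature combinations that are only useful together.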
Keywords :
emotion recognition; feature extraction; hidden Markov models; human computer interaction; pattern classification; speech recognition; support vector machines; Danish emotional speech database; HMM; K-nearest neighbors classifier; Mel energy spectrum dynamics coefficients; SFS method; SVM; automatic speech emotion recognition; candidate instantaneous feature extraction; hidden Markov model; human-machine interactions; pattern classification methods; sequential forward selection; support vector machine; Cepstrum; Energy measurement; Man machine systems; Mel frequency cepstral coefficient; Speech; Support vector machine classification
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
DOI :
10.1109/ICMLC.2005.1527805