DocumentCode
2576734
Title
Speech emotion classification with the combination of statistic features and temporal features
Author
Jiang, Dan-Ning ; Cai, Lian-Hong
Author_Institution
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Volume
3
fYear
2004
fDate
27-30 June 2004
Firstpage
1967
Abstract
Most previous systems for classifying speech emotion have used either statistical features or temporal features exclusively. However, these two distinct feature representations appear to capture different aspects of emotion and should be combined for the task. This work proposes a classification scheme that combines both. In the scheme, a GMM and an HMM are first used to model the statistical features and the temporal features, respectively. The GMM likelihoods and HMM likelihoods are then used as features in a further stage. Finally, a weighted Bayesian classifier and an MLP are applied to perform the classification. Experiments on a Chinese speech corpus demonstrated that the scheme can greatly improve classification accuracy. More detailed analysis indicated that the two feature representations complement each other effectively in classification.
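The second stage described in the abstract, where per-class GMM and HMM likelihoods are fused before a decision is made, might be sketched as follows. This is only an illustrative guess at a weighted Bayesian combination, not the authors' formulation: the log-linear weighting, the uniform prior, the function names `fuse_posteriors` and `classify`, the `weight` parameter, and the emotion labels are all assumptions introduced here.

```python
import math

def fuse_posteriors(gmm_loglik, hmm_loglik, weight=0.5, log_prior=None):
    """Combine per-class log-likelihoods from two models into posteriors.

    Assumed rule (hypothetical, not taken from the paper):
        score(c) = weight * log p_GMM(x|c)
                 + (1 - weight) * log p_HMM(x|c)
                 + log P(c)
    """
    classes = sorted(gmm_loglik)
    if log_prior is None:
        # Assumption: uniform class prior when none is supplied.
        log_prior = {c: -math.log(len(classes)) for c in classes}
    fused = {
        c: weight * gmm_loglik[c] + (1.0 - weight) * hmm_loglik[c] + log_prior[c]
        for c in classes
    }
    # Normalize with the log-sum-exp trick for numerical stability.
    peak = max(fused.values())
    log_z = peak + math.log(sum(math.exp(v - peak) for v in fused.values()))
    return {c: math.exp(v - log_z) for c, v in fused.items()}

def classify(gmm_loglik, hmm_loglik, weight=0.5):
    """Return the class with the highest fused posterior."""
    posteriors = fuse_posteriors(gmm_loglik, hmm_loglik, weight)
    return max(posteriors, key=posteriors.get)

# Hypothetical log-likelihoods for one utterance (made-up numbers).
example_gmm = {"anger": -10.0, "joy": -12.0, "sadness": -15.0}
example_hmm = {"anger": -14.0, "joy": -11.0, "sadness": -16.0}
example_label = classify(example_gmm, example_hmm, weight=0.5)
```

In this sketch `weight` controls how much the statistical (GMM) evidence counts relative to the temporal (HMM) evidence. The abstract also mentions an MLP applied to the likelihood features; a trained network over the same two likelihood vectors would take the place of the fixed weighting shown here.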
Keywords
Bayes methods; Gaussian distribution; emotion recognition; feature extraction; hidden Markov models; multilayer perceptrons; pattern classification; GMM likelihood; HMM likelihood; MLP; classification accuracy; feature representations; multiple-layer perceptron; speech emotion classification; statistical features; temporal features; weighted Bayesian classifier; Bayesian methods; Computer science; Coordinate measuring machines; Emotion recognition; Feature extraction; Hidden Markov models; Speech; Statistics; Support vector machine classification; Support vector machines;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia and Expo, 2004. ICME '04. 2004 IEEE International Conference on
Print_ISBN
0-7803-8603-5
Type
conf
DOI
10.1109/ICME.2004.1394647
Filename
1394647