DocumentCode
2704463
Title
Speech Emotion Recognition using Gaussian Mixture Vector Autoregressive Models
Author
El Ayadi, Moataz M. H. ; Kamel, Mohamed S. ; Karray, Fakhri
Author_Institution
Lab. of Pattern Anal. & Machine Intelligence, Waterloo Univ., Ont., Canada
Volume
4
fYear
2007
fDate
15-20 April 2007
Abstract
It is believed that modeling the temporal structure of speech data may be useful for speech emotion recognition (T. Nwe et al., 2003). In this paper, a Gaussian mixture vector autoregressive model is proposed as a statistical classifier for this task. The main motivation for using such a model is its ability to capture both the dependency among extracted speech feature vectors and the multi-modality of their distribution. When applied to the Berlin emotional speech database, the proposed technique achieves a classification accuracy of 76%, versus 71% for the hidden Markov model (HMM), 67% for k-nearest neighbors, and 55% for feed-forward neural networks. The model also discriminates better between high-arousal, low-arousal, and neutral emotions than the HMM.
Keywords
Gaussian processes; autoregressive processes; emotion recognition; speech processing; speech recognition; statistical analysis; Berlin emotional speech database; Gaussian mixture vector autoregressive models; extracted speech feature vectors; speech emotion recognition; statistical classifier; Emotion recognition; Feature extraction; Hidden Markov models; Machine intelligence; Neural networks; Pattern analysis; Reactive power; Spatial databases; Speech analysis; Speech synthesis; Gaussian mixture models; expectation maximization algorithm; maximum likelihood estimation; vector autoregressive models
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location
Honolulu, HI
ISSN
1520-6149
Print_ISBN
1-4244-0727-3
Type
conf
DOI
10.1109/ICASSP.2007.367230
Filename
4218261