DocumentCode
417217
Title
Automatic emotional speech classification
Author
Ververidis, Dimitrios ; Kotropoulos, Constantine ; Pitas, Ioannis
Author_Institution
Dept. of Informatics, Aristotle Univ., Thessaloniki, Greece
Volume
1
fYear
2004
fDate
17-21 May 2004
Abstract
Our purpose is to design a tool for use in psychology that automatically classifies utterances into five emotional states: anger, happiness, neutral, sadness, and surprise. The major contribution of the paper is to rate the discriminating capability of a set of features for emotional speech recognition. A total of 87 features have been calculated over 500 utterances from the Danish Emotional Speech database. The sequential forward selection (SFS) method has been used to discover a set of 5 to 10 features that classify the utterances best. The criterion used in SFS is the cross-validated correct classification score of one of the following classifiers: the nearest mean classifier and the Bayes classifier, where the class pdfs are either approximated via Parzen windows or modelled as Gaussians. After selecting the 5 best features, we reduce the dimensionality to two by applying principal component analysis. The result is a correct classification rate of 51.6% ± 3% (95% confidence interval) for the five aforementioned emotions, whereas random classification would give a correct classification rate of 20%. Furthermore, we identify the two-class emotion recognition problems whose error rates contribute heavily to the average error, and we indicate that the error rates reported in this paper could be reduced by employing two-class classifiers and combining them.
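The selection procedure described in the abstract can be illustrated with a minimal sketch: sequential forward selection that greedily adds the feature maximising the cross-validated correct classification score of a nearest mean classifier. This is not the authors' implementation. The function names (`make_toy_data`, `nearest_mean_predict`, `cv_score`, `sfs`), the 5-fold split, and the synthetic data standing in for the 87 speech features are all assumptions made for illustration.

```python
import numpy as np


def make_toy_data(seed=0, n_per_class=100, n_features=10):
    """Synthetic stand-in data (assumption): 5 classes, features 0 and 3 informative."""
    rng = np.random.default_rng(seed)
    y = np.repeat(np.arange(5), n_per_class)
    X = rng.normal(size=(len(y), n_features))
    X[:, 0] += 3.0 * y          # class mean grows with the class index
    X[:, 3] += 3.0 * (y % 2)    # class mean alternates between classes
    return X, y


def nearest_mean_predict(X_train, y_train, X_test):
    """Assign each test sample to the class with the closest training mean."""
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test sample to every class mean.
    dist = ((X_test[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]


def cv_score(X, y, n_folds=5, seed=0):
    """Cross-validated correct classification rate (fold count is an assumption)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        pred = nearest_mean_predict(X[train], y[train], X[test])
        correct += int((pred == y[test]).sum())
    return correct / len(y)


def sfs(X, y, n_select):
    """Greedy sequential forward selection using cv_score as the criterion."""
    selected, history = [], []
    remaining = list(range(X.shape[1]))
    while len(selected) < n_select and remaining:
        # Score every candidate feature when added to the current subset.
        scores = [cv_score(X[:, selected + [f]], y) for f in remaining]
        best = int(np.argmax(scores))
        selected.append(remaining.pop(best))
        history.append(scores[best])
    return selected, history


X, y = make_toy_data()
selected, history = sfs(X, y, n_select=3)
print("selected feature indices:", selected)
print("cross-validated score after each step:", history)
```

On the toy data the two informative features should be picked first. A Bayes classifier with Parzen-window or Gaussian class pdfs, as named in the abstract, could be substituted for `nearest_mean_predict` without changing the selection loop.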
Keywords
Bayes methods; Gaussian distribution; emotion recognition; error statistics; feature extraction; pattern classification; principal component analysis; probability; psychology; speech recognition; Bayes classifier; Danish Emotional Speech database; Gaussian model; Parzen windows; anger; automatic emotional speech classification; class pdf; cross-validated correct classification score; discriminating capability; error rates; features; happiness; nearest mean classifier; neutral; principal component analysis; psychology; sadness; sequential forward selection method; surprise; two-class classifiers; utterances; Artificial intelligence; Electronic mail; Error analysis; Frequency estimation; Informatics; Information analysis; Laboratories; Spatial databases; Speech recognition; Virtual reality;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326055
Filename
1326055