DocumentCode :
32010
Title :
Time-Frequency Feature and AMS-GMM Mask for Acoustic Emotion Classification
Author :
Zao, L. ; Cavalcante, D. ; Coelho, R.
Author_Institution :
Grad. Program in Defense Eng., Mil. Inst. of Eng. (IME), Rio de Janeiro, Brazil
Volume :
21
Issue :
5
fYear :
2014
fDate :
May 2014
Firstpage :
620
Lastpage :
624
Abstract :
In this letter, the pH time-frequency vocal source feature is proposed for multistyle emotion identification. A binary acoustic mask is also used to improve emotion classification accuracy. The experiments investigate emotional and stress conditions from the Berlin Database of Emotional Speech (EMO-DB) and the Speech under Simulated and Actual Stress (SUSAS) database. In terms of emotion identification rates, the pH feature outperforms the mel-frequency cepstral coefficients (MFCC) and a Teager-Energy-Operator (TEO) based feature. Moreover, the acoustic mask improves accuracy for both the MFCC and the pH features.
Keywords :
Gaussian processes; amplitude modulation; emotion recognition; signal classification; time-frequency analysis; AMS-GMM mask; Berlin database of emotional speech; EMO-DB database; Gaussian mixture models; MFCC; SUSAS database; acoustic emotion classification; amplitude modulation spectrogram; binary acoustic mask; mel-frequency cepstral coefficients; multistyle emotion identification; pH time-frequency vocal source feature; speech under simulated and actual stress databases; Teager energy operator; Acoustics; Databases; Discrete wavelet transforms; Feature extraction; Speech; Vectors; Hurst exponent; pH feature; speech emotion recognition
fLanguage :
English
Journal_Title :
IEEE Signal Processing Letters
Publisher :
IEEE
ISSN :
1070-9908
Type :
jour
DOI :
10.1109/LSP.2014.2311435
Filename :
6766238