Title :
Speaker states recognition using latent factor analysis based Eigenchannel factor vector modeling
Author :
Li, Ming ; Metallinou, Angeliki ; Bone, Daniel ; Narayanan, Shrikanth
Author_Institution :
Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
Abstract :
This paper presents an automatic speaker state recognition approach that models the factor vectors in the latent factor analysis framework, improving upon the Gaussian Mixture Model (GMM) baseline performance. We investigate both intoxicated and affective speaker states. We treat the affective speech signal as the original normal average speech signal corrupted by affective channel effects. Rather than reducing channel variability to enhance robustness, as is done in the speaker verification task, we directly model the speaker state on the channel factors under the factor analysis framework. In this work, the speaker state factor vectors are extracted by latent factor analysis within the GMM modeling framework and then classified with a support vector machine. Experimental results show that the proposed speaker state factor vector modeling system achieves 5.34% and 1.49% unweighted accuracy improvements over the GMM baseline on the intoxicated speech detection task (Alcohol Language Corpus) and the emotion recognition task (IEMOCAP database), respectively.
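The factor vector extraction described in the abstract can be illustrated with a minimal sketch. It assumes the common eigenchannel formulation in which an utterance supervector is M = m + Ux, with m the GMM mean supervector, U a low-rank eigenchannel matrix, and x the factor vector with a standard normal prior; the paper's exact formulation may differ. The function name `channel_factor` and all argument names are hypothetical, and the inputs are assumed to be precomputed Baum-Welch statistics.

```python
import numpy as np

def channel_factor(N, F, m, Sigma, U):
    """MAP point estimate of the factor vector x for one utterance.

    Assumed shapes (illustrative, not taken from the paper):
      N     : (C,)     zero-order statistics (soft counts per Gaussian)
      F     : (C, D)   first-order statistics
      m     : (C, D)   GMM component means
      Sigma : (C, D)   diagonal covariances
      U     : (C*D, R) eigenchannel matrix
    """
    C, D = m.shape
    R = U.shape[1]
    # Repeat each Gaussian's count across its D feature dimensions.
    N_exp = np.repeat(N, D)
    # Center the first-order statistics around the GMM means.
    F_centered = (F - N[:, None] * m).reshape(C * D)
    inv_sigma = 1.0 / Sigma.reshape(C * D)
    # U^T Sigma^{-1}, computed with broadcasting since Sigma is diagonal.
    Ut_inv_sigma = U.T * inv_sigma
    # Posterior precision: I + U^T Sigma^{-1} N U.
    precision = np.eye(R) + (Ut_inv_sigma * N_exp) @ U
    # Posterior mean: precision^{-1} U^T Sigma^{-1} (F - N m).
    return np.linalg.solve(precision, Ut_inv_sigma @ F_centered)
```

Under these assumptions, one factor vector would be computed per utterance and the resulting fixed-length vectors passed to a support vector machine for speaker state classification.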
Keywords :
Gaussian processes; eigenvalues and eigenfunctions; emotion recognition; speaker recognition; support vector machines; GMM modeling framework; Gaussian mixture model baseline performance; IEMOCAP database; SVM; affective channel effects; affective speech signal; alcohol language corpus; automatic speaker state recognition approach; eigenchannel factor vector modeling; emotion recognition task; intoxicated speech detection task; latent factor analysis framework; normal average speech signal; speaker verification task; support vector machine classification method; Accuracy; Adaptation models; Analytical models; Covariance matrix; Hidden Markov models; Speech; Vectors; Emotion recognition; Latent factor analysis; Speaker state recognition; Supervector modeling;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288284