Weight based super-GMM for speaker identification systems

Author

Garcia, Guillermo ; Eriksson, Thomas

Author_Institution

Dept. of Signals & Syst., Chalmers Univ. of Technol., Göteborg, Sweden

fYear

2008

fDate

25-29 Aug. 2008

Firstpage

1

Lastpage

5

Abstract

Gaussian Mixture Models (GMMs) are widely employed as statistical models in biometric systems. In speaker identification (SID) systems, GMMs have proven effective for modeling speaker identities. However, an increase in the number of enrolled speakers reduces the interspeaker variability, which degrades the performance of the recognizer. In this work, we propose a speaker super-GMM that addresses the interspeaker distance by providing access to a larger number of GMMs while maintaining the same complexity as the baseline system. The super-GMM is constructed by concatenating all the speaker GMMs enrolled in the database and weighting each GMM component. These weights carry the discriminative information used to determine each speaker model. To train the super-GMM, we train the weights of each mixture component using a variation of the expectation-maximization (EM) algorithm that updates only the weights of the super-GMM. We then apply a minimum classification error (MCE) approach to enhance the discriminative properties of the weights. Our approach shows an improvement of approximately 20% in performance (probability of error) compared to the baseline system.
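
The weights-only EM step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal covariances, the uniform initialisation, the fixed iteration count, and the names `log_gauss_diag` and `em_weights_only` are all assumptions made here for illustration, and the MCE refinement is not shown. The idea is that the component means and variances of the pooled (concatenated) GMM stay fixed, and only one weight vector per speaker is re-estimated from that speaker's feature vectors.

```python
import numpy as np


def log_gauss_diag(X, means, variances):
    """Log-density of each row of X under each diagonal-covariance Gaussian.

    X: (N, D) feature vectors; means, variances: (K, D).
    Returns an (N, K) array of log N(x_n | mean_k, diag(variance_k)).
    """
    diff = X[:, None, :] - means[None, :, :]            # (N, K, D)
    mahal = np.sum(diff ** 2 / variances, axis=2)       # (N, K)
    log_det = np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    return -0.5 * (mahal + log_det)


def em_weights_only(X, means, variances, n_iter=50):
    """EM that re-estimates only the mixture weights (hypothetical helper).

    The means and variances are those of the pooled components and are
    never updated, so the component log-densities are computed once.
    """
    K = means.shape[0]
    w = np.full(K, 1.0 / K)                       # assumed uniform start
    log_p = log_gauss_diag(X, means, variances)   # fixed across iterations
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame.
        log_joint = log_p + np.log(np.maximum(w, 1e-300))
        log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
        resp = np.exp(log_joint - log_norm)
        # M-step: new weight = average responsibility over all frames.
        w = resp.mean(axis=0)
        w = w / w.sum()
    return w
```

Running `em_weights_only` once per enrolled speaker, always over the same pooled `means` and `variances`, yields one weight vector per speaker. In this sketch a test utterance would be scored against each speaker by the log-likelihood under that speaker's weights; how the paper keeps the scoring cost equal to the baseline is not stated in the abstract and is not modelled here.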

Keywords

Gaussian processes; expectation-maximisation algorithm; mixture models; signal classification; speaker recognition; EM algorithm; Gaussian mixture models; MCE approach; SID systems; biometric systems; expectation maximization algorithm; interspeaker variability; minimum classification error approach; speaker identification systems; speaker identity modeling; statistical models; weight based super-GMM; Complexity theory; Databases; Feature extraction; Maximum likelihood estimation; Speech; Training; Discriminative methods; Gaussian distributions; estimation; modeling

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal Processing Conference, 2008 16th European

Conference_Location

Lausanne

ISSN

2219-5491

Type

conf

Filename

7080729