DocumentCode :
3131434
Title :
Training GMMs for speaker verification
Author :
Kelly, Finnian ; Harte, Naomi
Author_Institution :
Sigmedia Group, Department of Electronic and Electrical Engineering, Trinity College Dublin, Ireland
fYear :
2010
fDate :
23-24 June 2010
Firstpage :
163
Lastpage :
168
Abstract :
An established approach to training Gaussian Mixture Models (GMMs) for speaker verification is the expectation-maximisation (EM) algorithm. The EM algorithm has been shown to be sensitive to initialisation and prone to converging on local maxima. To explore these issues, three different initialisation methods are implemented, along with a split and merge technique to ‘pull’ the trained GMM out of a local maximum. It is shown that both of these approaches improve the likelihood of a GMM trained on speech data. Results of a verification task on the TIMIT and YOHO databases show that increased model fit does not directly translate into an improved equal error rate (EER). In no case does the split and merge procedure improve the EER. TIMIT results show a peak in performance of 4.8% EER with 20 EM iterations and random GMM initialisation. An EER of 1.41% is achieved on the YOHO database under the same regime. It is concluded that running EM to the optimal point of convergence achieves the best speaker verification performance, but that this optimal point depends on the data and model parameters.
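The abstract reports verification performance as an EER, the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER could be estimated from trial scores. It assumes that higher scores favour the claimed speaker; the function name `equal_error_rate` and the score values are hypothetical illustrations, not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumes higher scores favour the claimed speaker.
    Returns (eer, threshold) at the point where the two error rates are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores for illustration only (not data from the paper).
genuine_scores = [2.1, 1.7, 0.9, 0.4, 1.3]
impostor_scores = [-1.2, 0.5, -0.3, 0.1, -0.8]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
```

With these made-up scores the sketch returns an EER of 0.2 at a threshold of 0.5. On real trial lists, interpolating between neighbouring thresholds would give a smoother estimate.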
Keywords :
Expectation Maximisation; Gaussian Mixture Model; Speaker Verification; Split and Merge;
fLanguage :
English
Publisher :
IET
Conference_Titel :
Signals and Systems Conference (ISSC 2010), IET Irish
Conference_Location :
Cork
Type :
conf
DOI :
10.1049/cp.2010.0506
Filename :
5638424