DocumentCode
2178795
Title
Soft frame margin estimation of Gaussian Mixture Models for speaker recognition with sparse training data
Author
Yin, Yan ; Li, Qi
Author_Institution
Li Creative Technol., Inc., Florham Park, NJ, USA
fYear
2011
fDate
22-27 May 2011
Firstpage
5268
Lastpage
5271
Abstract
Discriminative training (DT) methods for acoustic modeling, such as MMI, MCE, and SVM, have proven effective in speaker recognition. In this paper we propose a DT method for GMMs using soft frame margin estimation. Unlike other DT methods such as MMI or MCE, soft frame margin estimation attempts to enhance the generalization capability of the GMM to unseen data when a mismatch exists between the training data and the unseen data. We define an objective function that integrates a multi-class separation frame margin and a loss function, both as functions of GMM likelihoods. We propose to optimize the objective function using a convex optimization technique, semidefinite programming. As shown in our experimental results, the proposed soft frame margin discriminative training with semidefinite programming optimization (SFME-SDP) is very effective for robust speaker model training when only limited amounts of training data are available.
Keywords
Gaussian processes; convex programming; speaker recognition; DT method; GMM likelihoods; Gaussian mixture models; MCE; MMI; SVM; acoustic modeling; convex optimization technique; discriminative training method; semidefinite programming; soft frame margin estimation; sparse training data; Conferences; Convex functions; Estimation; Hidden Markov models; Speech; Support vector machines; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location
Prague
ISSN
1520-6149
Print_ISBN
978-1-4577-0538-0
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2011.5947546
Filename
5947546