DocumentCode
2799252
Title
Maximum a posteriori linear regression for speaker recognition
Author
Zhang, Xiang ; Wang, Haipeng ; Xiao, Xiang ; Zhang, Jianping ; Yan, Yonghong
Author_Institution
ThinkIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences, Beijing, China
fYear
2010
fDate
14-19 March 2010
Firstpage
4542
Lastpage
4545
Abstract
Maximum likelihood linear regression (MLLR) transforms have recently been proposed as features for SVM-based speaker recognition, achieving performance comparable to that of state-of-the-art approaches. In this paper, we focus on calculating the transforms based on a GMM universal background model (UBM). Rather than estimating the transforms using the maximum likelihood criterion, we describe a new feature extraction technique for speaker recognition based on maximum a posteriori linear regression (MAPLR). This work is extended with a proposed multi-class technique, which clusters the Gaussian mixture components into regression classes and estimates a separate transform for each class. The transforms of all classes for a given utterance are concatenated into a supervector for SVM classification. Experiments on a NIST 2008 SRE corpus show that the speaker recognition system using MAPLR outperforms the one using MLLR, and that the multi-class approach also brings significant gains for the MAPLR system.
Keywords
MAPLR; MLLR; SVM; Speaker recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location
Dallas, TX, USA
ISSN
1520-6149
Print_ISBN
978-1-4244-4295-9
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2010.5495579
Filename
5495579