Title :
Improved GMM-based language recognition using constrained MLLR transforms
Author :
Shen, Wade ; Reynolds, Douglas
Author_Institution :
Syst. & Technol. Group, MIT, Lexington, MA
Date :
March 31 2008-April 4 2008
Abstract :
In this paper we describe the application of a feature-space transform based on constrained maximum likelihood linear regression (CMLLR) to the language recognition problem, where it provides unsupervised compensation for channel and speaker variability. We show that such transforms improve baseline GMM-based language recognition performance on the 2005 NIST Language Recognition Evaluation (LRE05) task by 38%. Furthermore, the gains from CMLLR are additive with those of other modeling enhancements such as vocal tract length normalization (VTLN). Discriminative training yields further improvement, and we show that a system using only CMLLR adaptation achieves state-of-the-art accuracy at lower test-time computational cost than systems using VTLN.
Keywords :
Gaussian processes; maximum likelihood estimation; natural language processing; regression analysis; speech recognition; GMM-based language recognition; NIST Language Recognition Evaluation task; channel variability compensation; constrained MLLR transforms; constrained maximum likelihood linear regression; discriminative training; feature-space transform; speaker variability compensation; vocal tract length normalization; Argon; Automatic speech recognition; Information systems; Laboratories; Loudspeakers; Maximum likelihood linear regression; NIST; Natural languages; Speaker recognition; System testing; Adaptation; GMM; LID; Language Recognition; MMI; Maximum Likelihood Linear Regression
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518568