Title :
Intersession variability compensation for language detection
Author :
Xi Zhou;Jiri Navratil;Jason W. Pelecanos;Ganesh N. Ramaswamy;Thomas S. Huang
Author_Institution :
Dept. of ECE, University of Illinois at Urbana-Champaign (UIUC), 61801, USA
Abstract :
Gaussian mixture models (GMMs) have become one of the standard acoustic approaches for language detection. These models are typically used to produce a log-likelihood ratio (LLR) verification statistic. In this framework, intersession variability within each language is an adverse factor that degrades accuracy. To address this problem, we formulate the LLR as a function of the GMM parameters concatenated into normalized mean supervectors, and estimate the distribution of each language in this high-dimensional supervector space. The goal is to de-emphasize the directions with the largest intersession variability. We compare this method with two other popular intersession variability compensation methods, Nuisance Attribute Projection (NAP) and Within-Class Covariance Normalization (WCCN). Experiments on the NIST LRE 2003 and NIST LRE 2005 speech corpora show that the presented technique reduces the error by 50% relative to the baseline and performs competitively with the NAP and WCCN approaches. Fusion results with a phonotactic component are also presented.
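To make the idea of de-emphasizing high-variability directions concrete, the following is a minimal sketch in the spirit of NAP, one of the comparison methods named above. It is not the paper's proposed technique, whose details the abstract does not give. It assumes supervectors are plain lists of floats, removes only a single nuisance direction with a hard projection, and finds that direction by power iteration; all function names are hypothetical.

```python
import math
import random


def center_by_language(supervectors, labels):
    """Subtract each language's mean so only intersession variation remains."""
    dim = len(supervectors[0])
    sums, counts = {}, {}
    for vec, lab in zip(supervectors, labels):
        acc = sums.setdefault(lab, [0.0] * dim)
        for i, x in enumerate(vec):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1
    return [
        [x - sums[lab][i] / counts[lab] for i, x in enumerate(vec)]
        for vec, lab in zip(supervectors, labels)
    ]


def top_nuisance_direction(centered, iterations=200, seed=0):
    """Power iteration for the leading eigenvector of the within-language scatter."""
    rng = random.Random(seed)
    dim = len(centered[0])
    u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in u))
    u = [x / norm for x in u]
    for _ in range(iterations):
        # Scatter-times-vector without forming the matrix: sum_k (d_k . u) d_k
        nxt = [0.0] * dim
        for d in centered:
            dot = sum(a * b for a, b in zip(d, u))
            for i, a in enumerate(d):
                nxt[i] += dot * a
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # no within-language variation along the current direction
        u = [x / norm for x in nxt]
    return u


def remove_direction(vec, u):
    """Apply the projection P = I - u u^T, assuming u has unit norm."""
    dot = sum(a * b for a, b in zip(vec, u))
    return [a - dot * b for a, b in zip(vec, u)]


# Toy data (illustrative only): sessions vary along axis 0, languages along axis 1.
toy_supervectors = [[-1.0, 1.0], [1.0, 1.0], [0.0, 1.0], [-2.0, -1.0], [2.0, -1.0]]
toy_labels = ["lang_a", "lang_a", "lang_a", "lang_b", "lang_b"]

direction = top_nuisance_direction(center_by_language(toy_supervectors, toy_labels))
compensated = [remove_direction(v, direction) for v in toy_supervectors]
```

In this toy setup the projection collapses the session-dependent axis while leaving the language-discriminating axis untouched. A soft de-emphasis, as the abstract describes, would down-weight such directions rather than remove them outright.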
Keywords :
"Support vector machines","NIST","Concatenated codes","Kernel","Acoustic signal detection","Speech","Testing","Databases","Speaker recognition","Support vector machine classification"
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
2379-190X
DOI :
10.1109/ICASSP.2008.4518570