DocumentCode :
3428543
Title :
Nearest neighbor discriminant analysis for language recognition
Author :
Sadjadi, Seyed Omid ; Pelecanos, Jason W. ; Ganapathy, Sriram
Author_Institution :
Watson Group, IBM Res., Yorktown Heights, NY, USA
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
4205
Lastpage :
4209
Abstract :
Many state-of-the-art i-vector based voice biometric systems use linear discriminant analysis (LDA) as a post-processing stage to increase the computational efficiency in the back-end via dimensionality reduction, as well as to annihilate the undesired (noisy) directions in the total variability subspace. The traditional approach for computing the LDA transform uses parametric representations for both intra- and inter-class scatter matrices that are based on the Gaussian distribution assumption. However, it is known that the actual distribution of i-vectors may not necessarily be Gaussian, particularly in the presence of noise and channel distortions. In addition, the rank of the LDA projection (i.e., the maximum number of available discriminant bases) is limited to the number of classes minus 1. Accordingly, language recognition tasks on noisy data that involve only a few language classes receive limited or no benefit from the LDA post-processing. Motivated by this observation, we present an alternative non-parametric discriminant analysis (NDA) technique that measures both the within- and between-language variation on a local basis using the nearest neighbor rule. The effectiveness of the NDA method is evaluated in the context of noisy language recognition tasks using speech material from the DARPA Robust Automatic Transcription of Speech (RATS) program. Experimental results indicate that NDA is more effective than the traditional parametric LDA for language recognition under noisy and channel-degraded conditions.
Keywords :
Gaussian distribution; matrix algebra; speech recognition; statistical analysis; vectors; DARPA Robust Automatic Transcription of Speech program; Gaussian distribution assumption; LDA transform; NDA; actual i-vector distribution; between-language variation; computational efficiency; dimensionality reduction; i-vector based voice biometric systems; interclass scatter matrices; intraclass scatter matrices; linear discriminant analysis; nearest neighbor discriminant analysis; noisy language recognition tasks; nonparametric discriminant analysis technique; parametric representations; total variability subspace; within-language variation; Kernel; Linear discriminant analysis; NIST; Polynomials; Speech; Support vector machines; Training; RATS; discriminant analysis; i-vector; language recognition; nearest neighbor;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178763
Filename :
7178763