Regional Information Center for Science and Technology (RICEST) - Nearest neighbor discriminant analysis for language recognition

DocumentCode :

3428543

Title :

Nearest neighbor discriminant analysis for language recognition

Author :

Sadjadi, Seyed Omid ; Pelecanos, Jason W. ; Ganapathy, Sriram

Author_Institution :

Watson Group, IBM Res., Yorktown Heights, NY, USA

fYear :

2015

fDate :

19-24 April 2015

Firstpage :

4205

Lastpage :

4209

Abstract :

Many state-of-the-art i-vector based voice biometric systems use linear discriminant analysis (LDA) as a post-processing stage to increase the computational efficiency of the back-end via dimensionality reduction, as well as to annihilate the undesired (noisy) directions in the total variability subspace. The traditional approach for computing the LDA transform uses parametric representations for both intra- and inter-class scatter matrices that are based on the Gaussian distribution assumption. However, it is known that the actual distribution of i-vectors may not necessarily be Gaussian, particularly in the presence of noise and channel distortions. In addition, the rank of the LDA projection (i.e., the maximum number of available discriminant bases) is limited to the number of classes minus 1. Accordingly, language recognition tasks on noisy data that involve only a few language classes receive limited or no benefit from the LDA post-processing. Motivated by this observation, we present an alternative non-parametric discriminant analysis (NDA) technique that measures both the within- and between-language variation on a local basis using the nearest neighbor rule. The effectiveness of the NDA method is evaluated in the context of noisy language recognition tasks using speech material from the DARPA Robust Automatic Transcription of Speech (RATS) program. Experimental results indicate that NDA is more effective than the traditional parametric LDA for language recognition under noisy and channel-degraded conditions.
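
The contrast the abstract draws between parametric LDA scatter and nearest-neighbor scatter can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`lda_between_scatter`, `local_mean`, `nda_scatter`), the choice of `k`, the unweighted sums, and the synthetic 5-dimensional data standing in for i-vectors are all assumptions made for illustration, and any per-sample weighting the paper may apply is omitted.

```python
import numpy as np

def lda_between_scatter(X, y):
    """Parametric between-class scatter: sum_c n_c (mu_c - mu)(mu_c - mu)^T."""
    mu = X.mean(axis=0)
    Sb = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(y):
        Xc = X[y == c]
        d = Xc.mean(axis=0) - mu
        Sb += len(Xc) * np.outer(d, d)
    return Sb

def local_mean(x, pool, k):
    """Mean of the k nearest neighbors of x in pool (Euclidean distance)."""
    dist = np.linalg.norm(pool - x, axis=1)
    nearest = np.argsort(dist)[:k]
    return pool[nearest].mean(axis=0)

def nda_scatter(X, y, k=3):
    """Local within- and between-class scatter from kNN means (unweighted sketch)."""
    dim = X.shape[1]
    Sw = np.zeros((dim, dim))
    Sb = np.zeros((dim, dim))
    classes = np.unique(y)
    for c in classes:
        Xc = X[y == c]
        for i, x in enumerate(Xc):
            # Within-class: neighbors from the same class, excluding x itself.
            same = np.delete(Xc, i, axis=0)
            dw = x - local_mean(x, same, k)
            Sw += np.outer(dw, dw)
            # Between-class: neighbors from every other class.
            for other in classes:
                if other != c:
                    db = x - local_mean(x, X[y == other], k)
                    Sb += np.outer(db, db)
    return Sw, Sb

# Synthetic two-class data in 5 dimensions (a stand-in for i-vectors).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(3.0, 1.0, (50, 5))])
y = np.repeat([0, 1], 50)

Sw, Sb_nda = nda_scatter(X, y, k=3)
print("LDA between-class rank:", np.linalg.matrix_rank(lda_between_scatter(X, y)))
print("NDA between-class rank:", np.linalg.matrix_rank(Sb_nda))
```

With two classes, the parametric between-class scatter has rank at most one (the number of classes minus 1), so LDA offers a single discriminant direction. The local scatter is accumulated from one difference vector per sample, so its rank is generally not tied to the number of classes.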

Keywords :

Gaussian distribution; matrix algebra; speech recognition; statistical analysis; vectors; DARPA Robust Automatic Transcription of Speech program; Gaussian distribution assumption; LDA transform; NDA; actual i-vector distribution; between-language variation; computational efficiency; dimensionality reduction; i-vector based voice biometric systems; interclass scatter matrices; intraclass scatter matrices; linear discriminant analysis; nearest neighbor discriminant analysis; noisy language recognition tasks; nonparametric discriminant analysis technique; parametric representations; total variability subspace; within-language variation; Kernel; Linear discriminant analysis; NIST; Polynomials; Speech; Support vector machines; Training; RATS; discriminant analysis; i-vector; language recognition; nearest neighbor;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location :

South Brisbane, QLD

Type :

conf

DOI :

10.1109/ICASSP.2015.7178763

Filename :

7178763

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3428543