Regional Information Center for Science and Technology (RICEST) - Source-normalised-and-weighted LDA for robust speaker recognition using i-vectors

DocumentCode :

2179872

Title :

Source-normalised-and-weighted LDA for robust speaker recognition using i-vectors

Author :

McLaren, Mitchell ; Van Leeuwen, David

Author_Institution :

Centre for Language & Speech Technology, Radboud University, Nijmegen, Netherlands

fYear :

2011

fDate :

22-27 May 2011

Firstpage :

5456

Lastpage :

5459

Abstract :

The recently developed i-vector framework for speaker recognition has set a new performance standard in the research field. An i-vector is a compact representation of a speaker utterance extracted from a low-dimensional total variability subspace. Prior to classification using a cosine kernel, i-vectors are projected into an LDA space in order to reduce inter-session variability and enhance speaker discrimination. The accurate estimation of this LDA space from a training dataset is crucial to classification performance. A typical training dataset, however, does not consist of utterances acquired from all sources of interest (i.e., telephone, microphone and interview speech sources) for each speaker. This has the effect of introducing source-related variation in the between-speaker covariance matrix and results in an incomplete representation of the within-speaker scatter matrix used for LDA. Proposed is a novel source-normalised-and-weighted LDA algorithm developed to improve the robustness of i-vector-based speaker recognition under both mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. Evaluated on the recent NIST 2008 and 2010 Speaker Recognition Evaluations (SRE), the proposed technique demonstrated improvements of up to 31% in minimum DCF and EER under mismatched and sparsely-resourced conditions.
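The baseline pipeline the abstract describes (projecting i-vectors into an LDA space and scoring with a cosine kernel) can be sketched as below. This is a minimal illustration of standard LDA only, not the source-normalised-and-weighted variant the paper proposes, whose formulation is not given in the abstract. The function names, the `ridge` regulariser and the use of NumPy are assumptions made for this example.

```python
import numpy as np


def scatter_matrices(X, labels):
    """Between- and within-speaker scatter for row-vector data X."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for spk in np.unique(labels):
        Xs = X[labels == spk]
        mu_s = Xs.mean(axis=0)
        diff = (mu_s - mu)[:, None]
        S_b += len(Xs) * (diff @ diff.T)   # speaker mean vs. global mean
        centered = Xs - mu_s
        S_w += centered.T @ centered       # utterances vs. their speaker mean
    return S_b, S_w


def lda_projection(X, labels, n_dims, ridge=1e-6):
    """Columns of the result are the top n_dims standard LDA directions.

    `ridge` is a hypothetical regulariser that keeps S_w invertible.
    """
    S_b, S_w = scatter_matrices(X, labels)
    S_w = S_w + ridge * np.eye(S_w.shape[0])
    # Whiten with the Cholesky factor S_w = L L^T, then solve a symmetric
    # eigenproblem for L^{-1} S_b L^{-T}.
    L = np.linalg.cholesky(S_w)
    M = np.linalg.solve(L, np.linalg.solve(L, S_b).T)
    evals, U = np.linalg.eigh(M)
    order = np.argsort(evals)[::-1][:n_dims]
    # Map eigenvectors back to the original space: A = L^{-T} U
    return np.linalg.solve(L.T, U[:, order])


def cosine_score(w1, w2, A):
    """Cosine kernel between two i-vectors after LDA projection A."""
    p1, p2 = A.T @ w1, A.T @ w2
    return float(p1 @ p2 / (np.linalg.norm(p1) * np.linalg.norm(p2)))
```

In use, `X` would hold one training i-vector per row and `labels` the speaker identity of each row; a trial is then scored with `cosine_score(enrol, test, A)` and compared against a threshold. With standard LDA, any source-related variation present in the training data enters `S_b` unchanged, which is the weakness the abstract says the proposed method addresses.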

Keywords :

S-matrix theory; speaker recognition; DCF; EER; LDA space; SRE; cosine kernel; i-vectors; inter-session variability reduce; robust speaker recognition evaluation; scatter matrix; source-normalised-and-weighted LDA; Covariance matrix; Microphones; NIST; Speech; Speech recognition; Training; i-vector; linear discriminant analysis; source variability; total variability

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location :

Prague

ISSN :

1520-6149

Print_ISBN :

978-1-4577-0538-0

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2011.5947593

Filename :

5947593

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2179872