Fusion of SNR-dependent PLDA models for noise robust speaker verification

Author

Xiaomin Pang ; Man-Wai Mak

Author_Institution

Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Hong Kong, China

fYear

2014

fDate

12-14 Sept. 2014

Firstpage

619

Lastpage

623

Abstract

The i-vector representation and probabilistic linear discriminant analysis (PLDA) have shown state-of-the-art performance in many speaker verification systems. However, in real-world environments, additive and convolutive noise cause mismatches between training and recognition conditions, degrading performance. In this paper, a fusion system that combines a multi-condition PLDA model and a mixture of SNR-dependent PLDA models is proposed to make the verification system noise-robust. The SNR of each test utterance is used to determine the best SNR-dependent PLDA model to score against the target speaker's i-vectors. The performance of the fusion system is demonstrated on NIST 2012 SRE. Results show that the SNR-dependent PLDA models can reduce the equal error rate (EER) and that the fusion system is more robust than conventional i-vector/PLDA systems under noisy conditions. It is also found that the SNR-dependent PLDA models are insensitive to Z-norm parameters.
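
The selection-and-fusion idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the authors' implementation: the abstract does not state how the best SNR-dependent model is chosen or how the scores are fused, so the sketch assumes nearest-SNR selection and a linear weighted sum. The scorers are hypothetical stand-ins (a dot product with an offset), not real PLDA log-likelihood ratios, and the names `select_snr_model`, `fused_score`, and `weight` are invented.

```python
from typing import Callable, Dict, Sequence

# A scorer takes (target i-vector, test i-vector) and returns a verification score.
Scorer = Callable[[Sequence[float], Sequence[float]], float]


def select_snr_model(test_snr_db: float, snr_models: Dict[float, Scorer]) -> float:
    """Return the nominal SNR (dB) of the model closest to the test utterance's SNR.

    Assumption: "best" model means nearest nominal SNR.
    """
    return min(snr_models, key=lambda snr: abs(snr - test_snr_db))


def fused_score(
    target_ivec: Sequence[float],
    test_ivec: Sequence[float],
    test_snr_db: float,
    multi_condition: Scorer,
    snr_models: Dict[float, Scorer],
    weight: float = 0.5,
) -> float:
    """Linearly fuse the multi-condition score with the selected SNR-dependent score.

    Assumption: fusion is a weighted sum; `weight` applies to the multi-condition score.
    """
    chosen = snr_models[select_snr_model(test_snr_db, snr_models)]
    s_snr = chosen(target_ivec, test_ivec)
    s_mc = multi_condition(target_ivec, test_ivec)
    return weight * s_mc + (1.0 - weight) * s_snr


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Toy stand-in scorer (NOT a PLDA log-likelihood ratio)."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical SNR-dependent scorers, keyed by nominal training SNR in dB.
snr_models: Dict[float, Scorer] = {
    6.0: lambda a, b: dot(a, b) - 1.0,
    15.0: lambda a, b: dot(a, b),
    30.0: lambda a, b: dot(a, b) + 1.0,
}

if __name__ == "__main__":
    target = [1.0, 0.0]
    test = [1.0, 0.0]
    print(select_snr_model(8.0, snr_models))
    print(fused_score(target, test, 8.0, dot, snr_models, weight=0.5))
```

In a real system the stand-in scorers would be replaced by PLDA models trained on data at each SNR level, and the fusion weight would be tuned on development data.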

Keywords

sensor fusion; speaker recognition; EER; NIST 2012 SRE; SNR-dependent PLDA models fusion; Z-norm parameters; additive noise; convolutive noise; i-vector representation; noise robust speaker verification; probabilistic linear discriminant analysis; real-world environments; test utterances; Abstracts; Analytical models; Noise; Resilience; Robustness; Speech; Training; LDA; Speaker verification; i-vectors; noise robustness; probabilistic

fLanguage

English

Publisher

IEEE

Conference_Titel

Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on

Conference_Location

Singapore

Type

conf

DOI

10.1109/ISCSLP.2014.6936593

Filename

6936593