Regional Information Center for Science and Technology - Construction of discriminative Kernels from known and unknown non-targets for PLDA-SVM scoring

DocumentCode :

179006

Title :

Construction of discriminative Kernels from known and unknown non-targets for PLDA-SVM scoring

Author :

Wei Rao ; Man-Wai Mak

Author_Institution :

Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong, China

fYear :

2014

fDate :

4-9 May 2014

Firstpage :

4012

Lastpage :

4016

Abstract :

Conventional PLDA scoring in i-vector speaker verification involves the i-vectors of target speakers and claimants only. We have previously demonstrated that better performance can be achieved by incorporating information about background speakers into the scoring process via speaker-dependent SVMs. This is achieved by defining, for each target speaker, a PLDA score space whose dimension equals the number of training i-vectors of that speaker. The new protocol in NIST 2012 SRE permits systems to use the information of other target speakers (called known non-targets) in each verification trial. In this paper, we exploit this new protocol to enhance the performance of PLDA-SVM scoring by using the score vectors of both known and unknown non-targets as the impostor-class data to train the speaker-dependent SVMs. Because some target speakers have only one enrollment utterance, the speaker- and impostor-class data for SVM training are severely imbalanced. This paper shows that if the enrollment utterance is sufficiently long, a number of target-speaker i-vectors can be generated by an utterance partitioning and resampling technique, resulting in much better scoring SVMs. Results on NIST 2012 SRE demonstrate the advantages of pooling the known and unknown non-targets for training the SVMs, and show that the resampling technique can help the SVM training algorithm find better decision boundaries for speakers with only a small number of enrollment utterances.
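The two ideas in the abstract, generating several sub-utterances from one enrollment utterance and mapping i-vectors into a speaker-specific score space, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the default partition and resampling counts, and the `score_fn` argument are all hypothetical; in particular, `score_fn` stands in for a PLDA log-likelihood-ratio scorer, which is not reproduced here, and the i-vector extraction and SVM training steps are omitted.

```python
import random


def partition_and_resample(frames, num_partitions=4, num_resamplings=2, seed=0):
    """Split one long utterance into several sub-utterances.

    Assumption: frame order is shuffled before each partitioning pass, so
    every pass yields a different set of sub-utterances. The full-length
    utterance is kept as the first item.
    """
    rng = random.Random(seed)
    segments = [list(frames)]
    for _ in range(num_resamplings):
        shuffled = list(frames)
        rng.shuffle(shuffled)
        size = len(shuffled) // num_partitions
        for p in range(num_partitions):
            start = p * size
            # The last partition absorbs any leftover frames.
            end = len(shuffled) if p == num_partitions - 1 else start + size
            segments.append(shuffled[start:end])
    return segments


def score_vector(ivector, enroll_ivectors, score_fn):
    """Map an i-vector to the score space of one target speaker.

    The result has one entry per enrollment i-vector of that speaker.
    """
    return [score_fn(ivector, e) for e in enroll_ivectors]


def build_svm_training_set(enroll_ivectors, known_nontargets,
                           unknown_nontargets, score_fn):
    """Pool known and unknown non-targets as the impostor class."""
    enroll = list(enroll_ivectors)
    impostors = list(known_nontargets) + list(unknown_nontargets)
    features = [score_vector(x, enroll, score_fn) for x in enroll + impostors]
    labels = [1] * len(enroll) + [-1] * len(impostors)
    return features, labels


def dot(a, b):
    """Toy scorer used only for the demonstration below (not PLDA)."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    subs = partition_and_resample(range(100), num_partitions=4, num_resamplings=2)
    print(len(subs), "sub-utterances from one enrollment utterance")

    feats, labs = build_svm_training_set(
        [[1.0, 0.0], [0.0, 1.0]],      # target-speaker i-vectors (toy values)
        [[1.0, 1.0]],                  # known non-targets (other target speakers)
        [[-1.0, 0.0], [0.0, -1.0]],    # unknown non-targets (background speakers)
        dot,
    )
    print(len(feats), "score vectors with labels", labs)
```

In a fuller pipeline, each sub-utterance would be passed to an i-vector extractor, and the resulting score vectors and labels would be handed to an SVM trainer for that target speaker.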

Keywords :

signal sampling; speaker recognition; support vector machines; vectors; NIST 2012 SRE; PLDA-SVM scoring; SVM training; background speakers; discriminative kernel construction; enrollment utterances; i-vector speaker verification; impostor-class data; performance enhancement; protocol; resampling technique; score vectors; speaker-class data; speaker-dependent SVMs; target-speaker i-vectors; utterance partitioning; Kernel; NIST; Noise; Speech; Support vector machines; Training; Vectors; I-vectors; empirical kernel maps; likelihood ratio kernels; probabilistic linear discriminant analysis

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location :

Florence

Type :

conf

DOI :

10.1109/ICASSP.2014.6854355

Filename :

6854355

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=179006