Regional Information Center for Science and Technology (RICEST) - Maximum margin linear kernel optimization for speaker verification

DocumentCode :

3528027

Title :

Maximum margin linear kernel optimization for speaker verification

Author :

Omar, Mohamed Kamal ; Pelecanos, Jason ; Ramaswamy, Ganesh N.

Author_Institution :

IBM T. J. Watson Res. Center, Yorktown Heights, NY

fYear :

2009

fDate :

19-24 April 2009

Firstpage :

4037

Lastpage :

4040

Abstract :

This paper describes a novel approach for discriminative modeling and its application to automatic text-independent speaker verification. This approach maximizes the margin between the model scores for pairs of utterances belonging to the same speaker and for pairs of utterances belonging to different speakers. A low-dimensional linear kernel is estimated which maximizes this margin. This approach emphasizes features which have a better ability to discriminate between scores belonging to pairs of utterances of the same target speakers and those of different speakers. In this paper, we apply this approach to the NIST 2005 speaker verification task. Compared to the Gaussian mixture model (GMM) baseline system, a 17.7% relative improvement in the minimum detection cost function (DCF) and an 11.7% relative improvement in equal error rate (EER) are obtained. We also achieve a 5.7% relative improvement in EER and a 2.3% relative improvement in DCF by using our approach on top of a nuisance attribute projection (NAP) compensated GMM-based kernel baseline system.
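
Illustrative note (not part of the original record): the abstract does not give the exact objective or optimizer, so the following is only a minimal sketch of the general idea, assuming a hinge-style pairwise loss. The projection matrix `A`, the `bias` threshold, the `margin` value, and all function names are hypothetical. Each utterance is assumed to be already represented as a fixed-length vector, and the kernel score of a pair is the inner product of the two vectors after projection by `A`.

```python
def project(A, x):
    """Project vector x to a lower dimension with matrix A (list of rows)."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def kernel_score(A, x, y):
    """Low-dimensional linear kernel: inner product of the projected vectors."""
    px, py = project(A, x), project(A, y)
    return sum(u * v for u, v in zip(px, py))


def pairwise_hinge_loss(A, pairs, labels, bias=0.0, margin=1.0):
    """Mean hinge loss over utterance pairs.

    labels: +1 for a same-speaker pair, -1 for a different-speaker pair.
    The loss is zero for a pair only when its score lies on the correct
    side of `bias` by at least `margin` (an assumed formulation).
    """
    losses = [
        max(0.0, margin - t * (kernel_score(A, x, y) - bias))
        for (x, y), t in zip(pairs, labels)
    ]
    return sum(losses) / len(losses)
```

Minimizing such a loss over `A` would push same-speaker scores up and different-speaker scores down, which is one way to read "maximizes the margin"; the paper itself may use a different formulation.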

Keywords :

error statistics; optimisation; speaker recognition; Gaussian mixture model; detection cost function; equal error rate; maximum margin linear kernel optimization; speaker verification; Covariance matrix; Kernel; Linear discriminant analysis; Loudspeakers; NIST; Speech analysis; Telephony; Testing; Training data; Vectors; GMM; Speaker verification; discriminative training; maximum margin; nuisance attribute projection;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location :

Taipei

ISSN :

1520-6149

Print_ISBN :

978-1-4244-2353-8

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2009.4960514

Filename :

4960514

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3528027