Regional Information Center for Science and Technology (RICeST) - Speaker identification using online, frame dependent, and diffusive variance adaptation

DocumentCode :

542189

Title :

Speaker identification using online, frame dependent, and diffusive variance adaptation

Author :

Axelrod, Scott

Author_Institution :

IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598, USA

Volume :

fYear :

2002

fDate :

13-17 May 2002

Abstract :

In this paper we perform maximum likelihood adaptation of the variances of a Gaussian mixture model (GMM) based on a single acoustic data frame. We show that, in the case of prototype (and frame) dependent scaling of the variances, the adaptation amounts to a simple non-linear warping of the exponent of the Gaussian. We also introduce algorithms to perform “diffusive” variance adaptation, in which a positive constant is added to the model variance. When the constant is prototype independent (but possibly frame and coordinate dimension dependent), this modification of the GMM is equivalent to evolving it under the diffusion equation of physics, which is guaranteed to increase entropy. Applied to the task of text-independent speaker identification on the LLHDB database, these methods yield a relative reduction in speaker identification error rate of up to 28% compared to the unadapted model.
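The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes diagonal covariances, a single unconstrained positive scale `s` per prototype, and hypothetical helper names (`ml_variance_scale`, `warped_logpdf`, `diffuse`) chosen only for illustration. Under those assumptions, maximizing N(x; mu, s * var) over `s` for one frame gives s = q / d, where q is the usual quadratic exponent and d the dimension, so the exponent q / 2 is replaced by the non-linear function (d / 2)(1 + log(q / d)). Adding a prototype-independent constant `c` to every variance is the same as convolving the GMM density with a zero-mean Gaussian of variance `c`, which is how a density evolves under the diffusion equation.

```python
import math


def gauss_logpdf(x, mu, var):
    """Log density of a diagonal-covariance Gaussian at one frame x."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mu, var)
    )


def ml_variance_scale(x, mu, var):
    """Scale s that maximizes N(x; mu, s * var) for a single frame.

    Assumption of this sketch: s is one unconstrained positive scalar per
    prototype; the paper may constrain or parameterize it differently.
    """
    q = sum((xi - m) ** 2 / v for xi, m, v in zip(x, mu, var))
    return q / len(x)


def warped_logpdf(x, mu, var):
    """Log density at the ML scale, written as a warping of the exponent.

    The quadratic term q / 2 is replaced by (d / 2) * (1 + log(q / d)).
    Requires x != mu, since q = 0 makes the logarithm undefined.
    """
    d = len(x)
    q = sum((xi - m) ** 2 / v for xi, m, v in zip(x, mu, var))
    log_norm = -0.5 * sum(math.log(2 * math.pi * v) for v in var)
    return log_norm - 0.5 * d * (1.0 + math.log(q / d))


def gmm_pdf(t, weights, means, variances):
    """Density of a one-dimensional GMM at the point t."""
    return sum(
        w * math.exp(-0.5 * (t - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    )


def diffuse(variances, c):
    """Diffusive adaptation: add the same positive constant c to every variance."""
    return [v + c for v in variances]
```

For example, `gmm_pdf(t, w, m, diffuse(v, c))` should agree, up to discretization error, with a numerical convolution of the original `gmm_pdf` with a Gaussian kernel of variance `c`.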

Keywords :

Adaptation model; Biological system modeling; Databases; Entropy; Frequency locked loops; Prototypes; Telephone sets;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on

Conference_Location :

Orlando, FL, USA

ISSN :

1520-6149

Print_ISBN :

0-7803-7402-9

Type :

conf

DOI :

10.1109/ICASSP.2002.5743677

Filename :

5743677

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=542189