DocumentCode :
542189
Title :
Speaker identification using online, frame dependent, and diffusive variance adaptation
Author :
Axelrod, Scott
Author_Institution :
IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598, USA
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
In this paper we perform maximum likelihood adaptation of the variances of a Gaussian mixture model (GMM) based on a single acoustic data frame. We show that, in the case of prototype- (and frame-) dependent scaling of the variances, the adaptation amounts to a simple non-linear warping of the exponent of the Gaussian. We also introduce algorithms to perform “diffusive” variance adaptation, in which a positive constant is added to the model variance. When the constant is prototype-independent (but possibly dependent on frame and coordinate dimension), this modification of the GMM is equivalent to evolving it under the diffusion equation of physics, which is guaranteed to increase entropy. Applied to text-independent speaker identification on the LLHDB database, these methods reduce the speaker identification error rate by up to 28% relative to the unadapted model.
Keywords :
Adaptation model; Biological system modeling; Databases; Entropy; Frequency locked loops; Prototypes; Telephone sets;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743677
Filename :
5743677