DocumentCode
417164
Title
Feature space Gaussianization
Author
Saon, George ; Dharanipragada, Satya ; Povey, Dan
Author_Institution
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Volume
1
fYear
2004
fDate
17-21 May 2004
Abstract
We propose a non-linear feature space transformation for speaker/environment adaptation that forces each dimension of the acoustic data for every speaker to be Gaussian distributed. The transformation is applied on a per-dimension basis and is given by the preimage of the empirical cumulative distribution function (CDF) under the Gaussian CDF; that is, each value is mapped through the empirical CDF and then through the inverse Gaussian CDF. We show that, for a given dimension, this transformation achieves minimum divergence between the density function of the transformed adaptation data and the normal density with zero mean and unit variance. Experimental results on both small and large vocabulary tasks show consistent improvements over applying linear adaptation transforms alone.
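The per-dimension mapping described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `gaussianize` and `gaussianize_features` are hypothetical, the empirical CDF is assumed to be estimated from ranks as `rank / (n + 1)` so that it stays strictly inside (0, 1), and ties are broken by input order for simplicity.

```python
from statistics import NormalDist


def gaussianize(values):
    """Map 1-D samples to roughly N(0, 1) via a rank-based empirical CDF.

    Assumption (not from the source): the empirical CDF of the sample with
    rank r out of n is taken as r / (n + 1), which avoids the infinite
    values that the inverse Gaussian CDF would return at 0 or 1.
    """
    n = len(values)
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Indices of the samples in ascending order of value (stable for ties).
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # Inverse Gaussian CDF applied to the empirical CDF value.
        out[idx] = std_normal.inv_cdf(rank / (n + 1))
    return out


def gaussianize_features(frames):
    """Gaussianize each dimension of one speaker's feature vectors independently.

    `frames` is a list of equal-length feature vectors (one per frame).
    """
    columns = [gaussianize(list(col)) for col in zip(*frames)]
    # Transpose back to one vector per frame.
    return [list(row) for row in zip(*columns)]
```

For example, `gaussianize_features` would be called once per speaker on that speaker's adaptation frames, so that each speaker's data is mapped toward the same zero-mean, unit-variance target in every dimension.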
Keywords
Gaussian distribution; speech recognition; transforms; Gaussian cumulative distribution function; acoustic data; density function; environment adaptation; feature space Gaussianization; linear adaptation transforms; nonlinear feature space transformation; speaker adaptation; unit variance; zero mean; Character generation; Decoding; Distribution functions; Gaussian processes; Hidden Markov models; Histograms; Loudspeakers; Nonlinear acoustics; Random variables; Training data
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1325989
Filename
1325989