Title :
Feature space Gaussianization
Author :
Saon, George ; Dharanipragada, Satya ; Povey, Dan
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
We propose a non-linear feature space transformation for speaker/environment adaptation which forces the individual dimensions of the acoustic data for every speaker to be Gaussian distributed. The transformation is given by the preimage under the Gaussian cumulative distribution function (CDF) of the empirical CDF on a per-dimension basis. We show that, for a given dimension, this transformation achieves minimum divergence between the density function of the transformed adaptation data and the normal density with zero mean and unit variance. Experimental results on both small and large vocabulary tasks show consistent improvements over the application of linear adaptation transforms only.
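The per-dimension mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gaussianize`, the list-of-lists input layout, and the half-sample offset used to keep the empirical CDF strictly inside (0, 1) are all assumptions made for illustration. The abstract specifies only that the inverse Gaussian CDF is applied to the empirical CDF of each dimension.

```python
from statistics import NormalDist


def gaussianize(frames):
    """Map every dimension of `frames` to an approximately N(0, 1) shape.

    `frames` is a list of equal-length feature vectors for one speaker
    (hypothetical layout, chosen for this sketch).
    """
    n = len(frames)
    dims = len(frames[0])
    std_normal = NormalDist()  # zero mean, unit variance
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Rank the frames by their value in dimension d.
        # sorted() is stable, so ties are broken by input order.
        order = sorted(range(n), key=lambda i: frames[i][d])
        for rank, i in enumerate(order):
            # Empirical CDF with a half-sample offset (an assumption of
            # this sketch) so the value never reaches 0 or 1.
            ecdf = (rank + 0.5) / n
            # Preimage of the empirical CDF under the Gaussian CDF.
            out[i][d] = std_normal.inv_cdf(ecdf)
    return out


if __name__ == "__main__":
    toy_frames = [[3.0, 10.0], [1.0, 30.0], [2.0, 20.0]]
    for row in gaussianize(toy_frames):
        print(row)
```

Because the mapping depends only on ranks within each dimension, it is monotone: the ordering of a speaker's values in a given dimension is preserved while their distribution is reshaped toward the standard normal.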
Keywords :
Gaussian distribution; speech recognition; transforms; Gaussian cumulative distribution function; acoustic data; density function; environment adaptation; feature space Gaussianization; linear adaptation transforms; nonlinear feature space transformation; speaker adaptation; unit variance; zero mean; Character generation; Decoding; Distribution functions; Gaussian processes; Hidden Markov models; Histograms; Loudspeakers; Nonlinear acoustics; Random variables; Training data
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1325989