Feature space Gaussianization

Author

Saon, George; Dharanipragada, Satya; Povey, Dan

Author_Institution

IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA

Volume

1

fYear

2004

fDate

17-21 May 2004

Abstract

We propose a non-linear feature space transformation for speaker/environment adaptation that forces the individual dimensions of the acoustic data for every speaker to be Gaussian distributed. The transformation is given by the preimage under the Gaussian cumulative distribution function (CDF) of the empirical CDF, computed on a per-dimension basis. We show that, for a given dimension, this transformation achieves minimum divergence between the density function of the transformed adaptation data and the normal density with zero mean and unit variance. Experimental results on both small and large vocabulary tasks show consistent improvements over applying linear adaptation transforms alone.
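
The per-dimension mapping described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `gaussianize_dimension` and `gaussianize` are hypothetical, the empirical CDF is estimated from sample ranks, and the `(rank - 0.5) / n` offset is an assumption made here to keep the CDF value strictly inside (0, 1) so that the inverse Gaussian CDF stays finite.

```python
from statistics import NormalDist


def gaussianize_dimension(values):
    """Map one feature dimension toward N(0, 1) using a rank-based empirical CDF."""
    n = len(values)
    # Sort sample indices by value; Python's sort is stable, so ties keep input order.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Assumed empirical CDF estimate: (rank - 0.5) / n, which never reaches 0 or 1.
    # The transformed value is the inverse standard normal CDF of that estimate.
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


def gaussianize(frames):
    """Apply the mapping independently to each dimension of a list of feature vectors."""
    columns = list(zip(*frames))  # one tuple per feature dimension
    transformed = [gaussianize_dimension(list(col)) for col in columns]
    return [list(row) for row in zip(*transformed)]  # back to one vector per frame
```

For instance, `gaussianize([[1.0, 10.0], [2.0, 5.0], [100.0, 7.0]])` sends the middle-ranked value in each dimension to 0 and the smallest and largest values to symmetric points on either side of it, regardless of how skewed the raw values are. In the setting the abstract describes, `frames` would hold the adaptation data of a single speaker.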

Keywords

Gaussian distribution; speech recognition; transforms; Gaussian cumulative distribution function; acoustic data; density function; environment adaptation; feature space Gaussianization; linear adaptation transforms; nonlinear feature space transformation; speaker adaptation; unit variance; zero mean; Character generation; Decoding; Distribution functions; Gaussian processes; Hidden Markov models; Histograms; Loudspeakers; Nonlinear acoustics; Random variables; Training data

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-8484-9

Type

conf

DOI

10.1109/ICASSP.2004.1325989

Filename

1325989