Title :
Histogram equalization and noise masking for robust speech recognition
Author :
Zhang, Xueru ; Demuynck, Kris ; Van hamme, Hugo
Author_Institution :
Dept. of Electr. Eng., Katholieke Univ. Leuven, Leuven, Belgium
Abstract :
Mismatch between training and test conditions degrades the performance of speech recognizers. This paper investigates the combination of parametric histogram equalization (pHEQ) and noise masking to compensate for the mismatch caused by additive noise. The proposed front-end maps the distribution of the observed power spectrum vectors to a target distribution, which matches the distribution of the noise-free training data except for an artificially reduced signal-to-noise ratio. Different power spectrum estimation algorithms are used so that the noise distribution used internally by pHEQ is estimated more reliably under non-stationary noise conditions. The proposed front-end is evaluated on the Aurora4 database and shows a significant improvement over mean-normalized Mel-frequency spectral coefficients. Moreover, performance could be improved further if better estimates of the instantaneous noise power spectrum were available.
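As a rough illustration of the distribution mapping mentioned in the abstract, the sketch below performs plain, non-parametric histogram equalization on a single feature dimension by matching empirical CDFs. It is not the paper's parametric pHEQ and includes no noise masking; the function name `equalize_to_target` and the rank-based CDF estimate are assumptions made for illustration only.

```python
import numpy as np

def equalize_to_target(observed, target):
    """Map `observed` so that its empirical CDF matches that of `target`.

    Hypothetical helper: generic CDF matching, not the paper's pHEQ.
    """
    observed = np.asarray(observed, dtype=float)
    target_sorted = np.sort(np.asarray(target, dtype=float))

    # Empirical CDF value of each observed sample, from its rank.
    ranks = np.argsort(np.argsort(observed))
    observed_cdf = (ranks + 0.5) / observed.size

    # CDF values attached to the sorted target samples.
    target_cdf = (np.arange(target_sorted.size) + 0.5) / target_sorted.size

    # Invert the target CDF by linear interpolation.
    return np.interp(observed_cdf, target_cdf, target_sorted)
```

The mapping is monotone, so the ordering of the observed samples is preserved while their distribution is pulled toward that of the target samples.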
Keywords :
speech recognition; Aurora4 database; noise distribution; noise masking; pHEQ; parametric histogram equalization; power spectrum estimation algorithms; signal-to-noise ratio; mean-normalized mel-frequency spectral coefficients; Additive noise; Databases; Histograms; Noise reduction; Noise robustness; Spectral analysis; Testing; Training data; Histogram equalization; noise power spectrum
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495571