DocumentCode
417306
Title
Minimum Kullback-Leibler distance based multivariate Gaussian feature adaptation for distant-talking speech recognition
Author
Pan, Yue ; Waibel, Alex
Author_Institution
Interactive Syst. Labs., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume
1
fYear
2004
fDate
17-21 May 2004
Abstract
Multivariate Gaussian-based speech compensation or mapping has been developed to reduce the mismatch between training and deployment conditions for robust speech recognition. The acoustic mapping procedure can be formulated as a feature space adaptation where a noisy input signal is transformed by a multivariate Gaussian network. We propose a novel algorithm to update the network parameters based on minimizing the Kullback-Leibler distance between the core recognizer's acoustic model and the transformed features. It is designed to achieve optimal overall system performance rather than MMSE on a specific feature domain. An online stochastic gradient descent learning rule is derived. We evaluate the performance of the new algorithm using a JRTk broadcast news system on a distant-talking speech corpus and compare its performance with that of previous MMSE-based approaches. The experiments show the KL-based approach is more effective for a large vocabulary continuous speech recognition (LVCSR) system.
Keywords
Gaussian processes; acoustic noise; gradient methods; hidden Markov models; learning (artificial intelligence); minimisation; random noise; speech recognition; HMM; Kullback-Leibler distance; LVCSR; MMSE; acoustic mapping; distant-talking speech recognition; feature space adaptation; large vocabulary continuous speech recognition; multivariate Gaussian feature adaptation; multivariate Gaussian network; robust speech recognition; speech compensation; speech mapping; stochastic gradient descent learning rule; Acoustic noise; Broadcasting; Gaussian noise; Robustness; Signal mapping; Speech analysis; Speech recognition; Stochastic processes; System performance; Vocabulary;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326164
Filename
1326164