DocumentCode
178074
Title
Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition
Author
Xue Feng; Yaodong Zhang; James Glass
Author_Institution
MIT Comput. Sci. & Artificial Intell. Lab., Cambridge, MA, USA
fYear
2014
fDate
4-9 May 2014
Firstpage
1759
Lastpage
1763
Abstract
Denoising autoencoders (DAs) have shown success in generating robust features for images, but there has been limited work on applying DAs to speech. In this paper, we present a deep denoising autoencoder (DDA) framework that can produce robust speech features for noisy reverberant speech recognition. The DDA is first pre-trained as restricted Boltzmann machines (RBMs) in an unsupervised fashion. It is then unrolled into an autoencoder and fine-tuned using the corresponding clean speech features to learn a nonlinear mapping from noisy to clean features. Acoustic models are re-trained on the features reconstructed by the DDA, and speech recognition is performed. The proposed approach is evaluated on the CHiME-WSJ0 corpus and shows a 16-25% absolute improvement in recognition accuracy under various SNRs.
Keywords
Boltzmann machines; learning (artificial intelligence); reverberation; signal denoising; speech coding; speech recognition; CHiME-WSJ0 corpus; acoustic models; deep denoising autoencoders; noisy reverberant speech recognition; recognition accuracy; restricted Boltzmann machines; speech feature denoising; speech feature dereverberation; unsupervised learning; Decoding; Hidden Markov models; Noise measurement; Noise reduction; Robustness; Speech; Speech recognition; deep neural network; denoising autoencoder; feature denoising; robust speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6853900
Filename
6853900