DocumentCode
164846
Title
Exploring deep neural networks and deep autoencoders in reverberant speech recognition
Author
Mimura, Masato ; Sakai, Shin'ichi ; Kawahara, Toshio
Author_Institution
Acad. Center for Comput. & Media Studies, Kyoto Univ., Sakyo-ku, Kyoto, Japan
fYear
2014
fDate
12-14 May 2014
Firstpage
197
Lastpage
201
Abstract
We propose an approach to reverberant speech recognition that adopts deep learning in the front end as well as the back end of the system. At the front end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, and at the back end, speech recognition is performed using DNN-HMM acoustic models. The system was evaluated on simulated and real reverberant speech data sets. On average, the DNN-HMM system trained on the multi-condition training data outperformed the MLLR-adapted GMM-HMM system trained on the same data. Feature enhancement with the DAE contributed to the improvement of recognition accuracy, especially in more adverse conditions. We also performed unsupervised adaptation of the DNN-HMM models to the test data enhanced by the DAE and achieved improvements in word accuracy in all reverberation conditions of the test data.
Keywords
Gaussian processes; acoustic signal processing; encoding; hidden Markov models; mixture models; neural nets; reverberation; speech recognition; unsupervised learning; DAE; DNN-HMM acoustic models; DNN-HMM system training; Gaussian mixture model; data testing; deep autoencoders; deep learning; deep-neural networks; hidden Markov models; multicondition training data; real reverberant speech data sets; recognition accuracy improvement; reverberant speech recognition; reverberation conditions; simulated reverberant speech data sets; speech feature parameter enhancement; system back end; system front end; unsupervised DNN-HMM models; word accuracy improvement; Accuracy; Hidden Markov models; Microphones; Neural networks; Speech; Speech recognition; Training; Deep Autoencoder (DAE); Deep Neural Networks (DNN); reverberant speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Hands-free Speech Communication and Microphone Arrays (HSCMA), 2014 4th Joint Workshop on
Conference_Location
Villers-les-Nancy
Type
conf
DOI
10.1109/HSCMA.2014.6843279
Filename
6843279