Title :
Exploring deep neural networks and deep autoencoders in reverberant speech recognition
Author :
Mimura, Masato ; Sakai, Shin'ichi ; Kawahara, Toshio
Author_Institution :
Acad. Center for Comput. & Media Studies, Kyoto Univ., Sakyo-ku, Kyoto, Japan
Abstract :
We propose an approach to reverberant speech recognition that adopts deep learning in the front end as well as the back end of the system. At the front end, we adopt a deep autoencoder (DAE) to enhance the speech feature parameters, and at the back end, speech recognition is performed using DNN-HMM acoustic models. The system was evaluated on simulated and real reverberant speech data sets. On average, the DNN-HMM system trained on the multi-condition training data outperformed the MLLR-adapted GMM-HMM system trained on the same data. Feature enhancement with the DAE contributed to the improvement of recognition accuracy, especially in more adverse conditions. We also performed unsupervised adaptation of the DNN-HMM models to the test data enhanced by the DAE and achieved improvements in word accuracy in all reverberation conditions of the test data.
Keywords :
Gaussian processes; acoustic signal processing; encoding; hidden Markov models; mixture models; neural nets; reverberation; speech recognition; unsupervised learning; DAE; DNN-HMM acoustic models; DNN-HMM system training; Gaussian mixture model; data testing; deep autoencoders; deep learning; deep-neural networks; hidden Markov models; multicondition training data; real reverberant speech data sets; recognition accuracy improvement; reverberant speech recognition; reverberation conditions; simulated reverberant speech data sets; speech feature parameter enhancement; system back end; system front end; unsupervised DNN-HMM models; word accuracy improvement; Accuracy; Hidden Markov models; Microphones; Neural networks; Speech; Speech recognition; Training; Deep Autoencoder (DAE); Deep Neural Networks (DNN); reverberant speech recognition;
Conference_Title :
Hands-free Speech Communication and Microphone Arrays (HSCMA), 2014 4th Joint Workshop on
Conference_Location :
Villers-les-Nancy
DOI :
10.1109/HSCMA.2014.6843279