Title :
Deep hierarchical bottleneck MRASTA features for LVCSR
Author :
Tüske, Zoltán ; Schlüter, Ralf ; Ney, Hermann
Author_Institution :
Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany
Abstract :
Hierarchical multilayer perceptron (MLP) based long-term feature extraction is optimized for a TANDEM connectionist large vocabulary continuous speech recognition (LVCSR) system within the QUAERO project. When the bottleneck MLP is trained on multi-resolutional RASTA (MRASTA) filtered critical band energies, a relative word error rate (WER) reduction of more than 20% over a standard MFCC system is observed after optimizing the number of target labels. Furthermore, introducing a deeper structure in the hierarchical bottleneck processing increases the relative gain to 25%. The final system, based on deep bottleneck TANDEM features, clearly outperforms the hybrid approach, even if the long-term features are also presented to the deep MLP acoustic model. The results are also verified on the 2012 evaluation data, where a relative WER improvement of about 20% over a classical cepstral system is measured even after speaker adaptive training.
Keywords :
acoustic signal processing; error statistics; feature extraction; filtering theory; multilayer perceptrons; signal resolution; speech recognition; vocabulary; LVCSR system; MLP based long-term feature extraction; QUAERO project; TANDEM connectionist; WER reduction; critical band energies; deep MLP acoustic model; deep bottleneck TANDEM; deep hierarchical bottleneck MRASTA features; filtering; hierarchical bottleneck processing; hierarchical multilayer perceptron; large vocabulary continuous speech recognition; multiresolutional RASTA; relative word error rate reduction; speaker adaptive training; standard MFCC system; Adaptation models; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Training; LVCSR; MLP; MRASTA; TANDEM; bottleneck; deep neural network; hierarchical; hybrid;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639013