Title :
RASR/NN: The RWTH neural network toolkit for speech recognition
Author :
Wiesler, Simon ; Richard, Alexander ; Golik, Pavel ; Schlüter, Ralf ; Ney, Hermann
Author_Institution :
Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany
Abstract :
This paper describes the new release of RASR, the open-source version of the well-proven speech recognition toolkit developed and used at RWTH Aachen University. The focus is on the implementation of the NN module for training neural network acoustic models. We describe the code design, configuration, and features of the NN module. Its key feature is high flexibility regarding the network topology, choice of activation functions, training criteria, and optimization algorithm, as well as built-in support for efficient GPU computing. Run-time performance and recognition accuracy are evaluated, by way of example, with a deep neural network as the acoustic model in a hybrid NN/HMM system. The results show that RASR achieves state-of-the-art performance on a real-world large-vocabulary task, while offering a complete pipeline for building and applying large-scale speech recognition systems.
Keywords :
graphics processing units; hidden Markov models; neural nets; optimisation; public domain software; speech recognition; telecommunication computing; telecommunication network topology; GPU computing; NN module; RASR-NN; RWTH Aachen University; RWTH neural network toolkit; activation functions; code configuration; code design; code features; hybrid NN-HMM system; network topology; open source version; optimization; run-time performance; training criteria; training neural network acoustic models; Acoustics; Neural networks; Speech; Training; GPU; RASR; acoustic modeling; open source
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854207