DocumentCode
1691396
Title
Multiframe deep neural networks for acoustic modeling
Author
Vanhoucke, V.; Devin, M.; Heigold, Georg
Author_Institution
Google, Inc., Mountain View, CA, USA
fYear
2013
Firstpage
7582
Lastpage
7585
Abstract
Deep neural networks have been shown to perform very well as acoustic models for automatic speech recognition. Compared to Gaussian mixtures, however, they tend to be computationally very expensive, making them challenging to use in real-time applications. One key advantage of such neural networks is their ability to learn from very long observation windows of up to 400 ms. Given this very long temporal context, it is natural to ask whether one can run neural networks at a lower frame rate than the typical one frame every 10 ms, and whether doing so would bring computational benefits. This paper describes a method of tying the neural network parameters over time that achieves performance comparable to the typical frame-synchronous model while reducing the computational cost of the neural network activations by up to 4x.
Keywords
Gaussian processes; neural nets; speech recognition; Gaussian mixtures; acoustic modeling; automatic speech recognition; computational cost; frame synchronous model; multiframe deep neural networks; Acoustics; Complexity theory; Computational modeling; Context; Error analysis; Hidden Markov models; Neural networks; deep neural networks
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location
Vancouver, BC
ISSN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2013.6639137
Filename
6639137