  • DocumentCode
    1691396
  • Title
    Multiframe deep neural networks for acoustic modeling
  • Author
    Vanhoucke, V.; Devin, M.; Heigold, Georg
  • Author_Institution
    Google, Inc., Mountain View, CA, USA
  • fYear
    2013
  • Firstpage
    7582
  • Lastpage
    7585
  • Abstract
    Deep neural networks have been shown to perform very well as acoustic models for automatic speech recognition. Compared to Gaussian mixtures, however, they tend to be computationally expensive, which makes them challenging to use in real-time applications. One key advantage of such neural networks is their ability to learn from very long observation windows of up to 400 ms. Given this long temporal context, it is tempting to ask whether one can run neural networks at a lower frame rate than the typical one frame every 10 ms, and whether doing so brings computational benefits. This paper describes a method of tying the neural network parameters over time that performs comparably to the typical frame-synchronous model while reducing the computational cost of the neural network activations by up to 4X.
  • Keywords
    Gaussian processes; neural nets; speech recognition; Gaussian mixtures; acoustic modeling; automatic speech recognition; computational cost; frame synchronous model; multiframe deep neural networks; Acoustics; Complexity theory; Computational modeling; Context; Error analysis; Hidden Markov models; Neural networks; deep neural networks
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6639137
  • Filename
    6639137