• DocumentCode
    730336
  • Title
    Deep convolutional neural networks for acoustic modeling in low resource languages
  • Author
    Chan, William ; Lane, Ian
  • Author_Institution
    Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    2056
  • Lastpage
    2060
  • Abstract
    Convolutional Neural Networks (CNNs) have demonstrated powerful acoustic modelling capabilities due to their ability to account for structural locality in the feature space, and recent work has shown that CNNs often outperform fully connected Deep Neural Networks (DNNs) on TIMIT and LVCSR. In this paper, we perform a detailed empirical study of CNNs under the low resource condition, wherein we have only 10 hours of training data. We find that a two-dimensional convolutional structure performs best, and emphasize the importance of considering both time and spectrum when modelling acoustic patterns. We report detailed error rates across a wide variety of model structures and show that CNNs consistently outperform fully connected DNNs for this task.
  • Keywords
    neural nets; speech recognition; CNN; DNN; LVCSR; TIMIT; automatic speech recognition; deep convolutional neural networks; low resource languages; powerful acoustic modelling capabilities; Acoustics; Convolution; Error analysis; Hidden Markov models; Neural networks; Speech recognition; Training; Automatic Speech Recognition; Convolutional Neural Networks; Deep Neural Networks
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178332
  • Filename
    7178332