DocumentCode
730336
Title
Deep convolutional neural networks for acoustic modeling in low resource languages
Author
Chan, William; Lane, Ian
Author_Institution
Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
2056
Lastpage
2060
Abstract
Convolutional Neural Networks (CNNs) have demonstrated powerful acoustic modelling capabilities due to their ability to account for structural locality in the feature space, and in recent works CNNs have been shown to often outperform fully connected Deep Neural Networks (DNNs) on TIMIT and LVCSR tasks. In this paper, we perform a detailed empirical study of CNNs under the low resource condition, wherein we have only 10 hours of training data. We find that a two-dimensional convolutional structure performs best, and emphasize the importance of considering both time and spectrum when modelling acoustic patterns. We report detailed error rates across a wide variety of model structures and show that CNNs consistently outperform fully connected DNNs for this task.
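For illustration only, below is a minimal sketch of the kind of two-dimensional convolutional acoustic model the abstract describes: convolution kernels span both time and frequency before fully connected layers produce per-frame HMM state posteriors. The framework (PyTorch), layer sizes, feature dimensions, and number of output states are assumptions made for this example, not the configuration reported in the paper.

# Sketch of a 2D CNN acoustic model over log-mel filterbank context windows.
# All hyperparameters here are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn

class Conv2dAcousticModel(nn.Module):
    def __init__(self, num_states: int = 2000, time_frames: int = 11, mel_bins: int = 40):
        super().__init__()
        # 2D convolutions: kernels cover both time and spectrum, capturing the
        # structural locality in the feature space that the abstract refers to.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=(3, 5)),   # convolve over time x frequency
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),       # pool along frequency only
            nn.Conv2d(64, 64, kernel_size=(3, 3)),
            nn.ReLU(),
        )
        # Infer the flattened feature size with a dummy forward pass.
        with torch.no_grad():
            flat = self.conv(torch.zeros(1, 1, time_frames, mel_bins)).numel()
        # Fully connected layers map convolutional features to state posteriors.
        self.fc = nn.Sequential(
            nn.Linear(flat, 1024),
            nn.ReLU(),
            nn.Linear(1024, num_states),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.conv(x)                       # (batch, channels, time, freq)
        return self.fc(h.flatten(start_dim=1)) # unnormalized state posteriors

# Example usage: a batch of 8 context windows (11 frames x 40 mel bins).
model = Conv2dAcousticModel()
logits = model(torch.randn(8, 1, 11, 40))      # -> shape (8, 2000)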
Keywords
neural nets; speech recognition; CNN; DNN; LVCSR; TIMIT; automatic speech recognition; deep convolutional neural networks; low resource languages; powerful acoustic modelling capabilities; Acoustics; Convolution; Error analysis; Hidden Markov models; Neural networks; Speech recognition; Training; Automatic Speech Recognition; Convolutional Neural Networks; Deep Neural Networks
fLanguage
English
Publisher
IEEE
Conference_Titel
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location
South Brisbane, QLD, Australia
Type
conf
DOI
10.1109/ICASSP.2015.7178332
Filename
7178332
Link To Document