Title :
Deep convolutional neural networks for acoustic modeling in low resource languages
Author :
Chan, William ; Lane, Ian
Author_Institution :
Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Convolutional Neural Networks (CNNs) have demonstrated powerful acoustic modelling capabilities due to their ability to account for structural locality in the feature space, and recent work has shown that CNNs often outperform fully connected Deep Neural Networks (DNNs) on TIMIT and LVCSR. In this paper, we perform a detailed empirical study of CNNs under the low resource condition, wherein we have only 10 hours of training data. We find that a two-dimensional convolutional structure performs best, and we emphasize the importance of considering both time and spectrum when modelling acoustic patterns. We report detailed error rates across a wide variety of model structures and show that CNNs consistently outperform fully connected DNNs for this task.
Keywords :
neural nets; speech recognition; CNN; DNN; LVCSR; TIMIT; automatic speech recognition; deep convolutional neural networks; low resource languages; powerful acoustic modelling capabilities; Acoustics; Convolution; Error analysis; Hidden Markov models; Neural networks; Training; Convolutional Neural Networks; Deep Neural Networks
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178332