DocumentCode
1761818
Title
Convolutional Neural Networks for Speech Recognition
Author
Abdel-Hamid, Ossama ; Mohamed, Abdel-rahman ; Hui Jiang ; Li Deng ; Penn, Gerald ; Dong Yu
Author_Institution
Dept. of Electr. Eng. & Comput. Sci., York Univ., Toronto, ON, Canada
Volume
22
Issue
10
fYear
2014
fDate
Oct. 2014
Firstpage
1533
Lastpage
1545
Abstract
Recently, the hybrid deep neural network (DNN)-hidden Markov model (HMM) has been shown to significantly improve speech recognition performance over the conventional Gaussian mixture model (GMM)-HMM. The performance improvement is partially attributed to the ability of the DNN to model complex correlations in speech features. In this paper, we show that further error rate reduction can be obtained by using convolutional neural networks (CNNs). We first present a concise description of the basic CNN and explain how it can be used for speech recognition. We further propose a limited-weight-sharing scheme that can better model speech features. The special structures of CNNs, such as local connectivity, weight sharing, and pooling, provide some degree of invariance to small shifts of speech features along the frequency axis, which is important for dealing with speaker and environment variations. Experimental results show that CNNs reduce the error rate by 6%-10% compared with DNNs on the TIMIT phone recognition and the voice search large vocabulary speech recognition tasks.
Keywords
Gaussian processes; hidden Markov models; mixture models; neural nets; speech recognition; Gaussian mixture model; complex correlations; convolutional neural networks; hidden Markov model; hybrid deep neural network; limited-weight-sharing scheme; local connectivity; weight sharing; Convolution; Neural networks; Speech; Training; Vectors; Limited Weight Sharing (LWS) scheme; pooling
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
2329-9290
Type
jour
DOI
10.1109/TASLP.2014.2339736
Filename
6857341