Title :
Deep convolutional neural networks for LVCSR
Author :
Sainath, Tara N. ; Mohamed, Abdel-rahman ; Kingsbury, Brian ; Ramabhadran, Bhuvana
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
Convolutional Neural Networks (CNNs) are an alternative type of neural network that can be used to reduce spectral variations and model spectral correlations which exist in signals. Since speech signals exhibit both of these properties, CNNs are a more effective model for speech than Deep Neural Networks (DNNs). In this paper, we explore applying CNNs to large vocabulary speech tasks. First, we determine the appropriate architecture to make CNNs effective compared to DNNs for LVCSR tasks. Specifically, we focus on how many convolutional layers are needed, the optimal number of hidden units, the best pooling strategy, and the best input feature type for CNNs. We then explore the behavior of neural network features extracted from CNNs on a variety of LVCSR tasks, comparing CNNs to DNNs and GMMs. We find that CNNs offer a 13-30% relative improvement over GMMs and a 4-12% relative improvement over DNNs on a 400-hour Broadcast News task and a 300-hour Switchboard task.
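To make the architectural ingredients mentioned in the abstract concrete (convolutional layers over spectral features, a pooling strategy, hidden units, and a choice of input feature type), the following is a minimal PyTorch sketch of a CNN acoustic model. The log-mel input, frequency-only pooling, filter counts, and layer sizes here are illustrative assumptions, not the configuration reported in the paper.

# Minimal sketch of a CNN acoustic model: convolution + pooling over
# log-mel feature patches, fully connected layers, softmax over HMM states.
# All sizes below are illustrative, not the paper's exact configuration.
import torch
import torch.nn as nn

class CNNAcousticModel(nn.Module):
    def __init__(self, n_mels=40, context=11, n_states=5000):
        super().__init__()
        # Two convolutional layers; pooling is applied along the frequency
        # axis only, reducing spectral variation while keeping every frame.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 128, kernel_size=(9, 9), padding=(4, 4)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(3, 1)),   # pool over frequency only
            nn.Conv2d(128, 256, kernel_size=(4, 3), padding=(0, 1)),
            nn.ReLU(),
        )
        # Infer the flattened feature size from a dummy input.
        with torch.no_grad():
            flat = self.conv(torch.zeros(1, 1, n_mels, context)).numel()
        # Fully connected hidden layer, then logits over context-dependent states.
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat, 1024),
            nn.ReLU(),
            nn.Linear(1024, n_states),
        )

    def forward(self, x):
        # x: (batch, 1, n_mels, context_frames) log-mel feature patches
        return self.fc(self.conv(x))

model = CNNAcousticModel()
logits = model(torch.randn(8, 1, 40, 11))   # batch of 8 feature patches
print(logits.shape)                         # torch.Size([8, 5000])

Pooling only along the frequency axis is one plausible reading of the abstract's goal of reducing spectral variation (e.g., speaker and vocal-tract differences) without discarding temporal frames; the paper itself compares several pooling strategies.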
Keywords :
correlation methods; neural nets; speech recognition; CNN; DNN; LVCSR tasks; broadcast news; convolutional layers; deep convolutional neural networks; hidden units; large vocabulary continuous speech recognition; pooling strategy; spectral correlations model; spectral variations reduction; speech signals; switchboard task; time 300 hr; time 400 hr; Acoustics; Convolution; Hidden Markov models; Neural networks; Speech; Speech recognition; Training; Neural Networks; Speech Recognition;
Conference_Title :
2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639347