Regional Information Center for Science and Technology - An empirical study of learning rates in deep neural networks for speech recognition

DocumentCode :

1686337

Title :

An empirical study of learning rates in deep neural networks for speech recognition

Author :

Senior, Alan ; Heigold, Georg ; Ranzato, Marc'Aurelio ; Yang, Ke

Author_Institution :

Google Inc., New York, NY, USA

Year :

2013

Firstpage :

6724

Lastpage :

6728

Abstract :

Recent deep neural network systems for large vocabulary speech recognition are trained with minibatch stochastic gradient descent but use a variety of learning rate scheduling schemes. We investigate several of these schemes, particularly AdaGrad. Based on our analysis of its limitations, we propose a new variant 'AdaDec' that decouples long-term learning-rate scheduling from per-parameter learning rate variation. AdaDec was found to result in higher frame accuracies than other methods. Overall, careful choice of learning rate schemes leads to faster convergence and lower word error rates.
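
For orientation, the following is a minimal sketch of the standard AdaGrad update that the abstract refers to, written in plain Python. It is not taken from the paper: the function name `adagrad_step`, the toy quadratic objective, and the values of `base_lr` and `eps` are all illustrative assumptions. The abstract does not give the AdaDec formula, so no attempt is made to reproduce it here.

```python
import math

def adagrad_step(params, grads, accum, base_lr=0.1, eps=1e-8):
    """Apply one standard AdaGrad update in place (illustrative sketch).

    accum[i] holds the running sum of squared gradients for parameter i.
    Each parameter is scaled by base_lr / sqrt(accum[i]), so every
    parameter ends up with its own effective learning rate.
    """
    for i, g in enumerate(grads):
        accum[i] += g * g  # the accumulator never decreases
        params[i] -= base_lr * g / (math.sqrt(accum[i]) + eps)
    return params, accum

# Hypothetical toy objective: f(p) = 0.5 * sum(c_i * p_i**2),
# with very different curvature along each axis.
curvatures = [10.0, 0.1]
params = [1.0, 1.0]
accum = [0.0, 0.0]

for _ in range(100):
    grads = [c * p for c, p in zip(curvatures, params)]
    params, accum = adagrad_step(params, grads, accum)
```

On the first step both parameters move by almost exactly `base_lr`, even though their gradients differ by a factor of 100, which shows the per-parameter normalization. Because the accumulator only grows, the effective learning rate of each parameter can only shrink over time, so in plain AdaGrad the long-term decay and the per-parameter scaling are tied to the same quantity.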

Keywords :

gradient methods; neural nets; speech recognition; stochastic processes; AdaDec; AdaGrad; deep neural network; large vocabulary speech recognition; learning rate scheduling scheme; minibatch stochastic gradient descent; word error rates; Accuracy; Convergence; Neural networks; Speech; Training; Deep neural networks; Voice Search; learning rate

Language :

English

Publisher :

IEEE

Conference_Title :

2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location :

Vancouver, BC

ISSN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2013.6638963

Filename :

6638963

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1686337