DocumentCode :
134205
Title :
Investigation of stochastic Hessian-Free optimization in Deep neural networks for speech recognition
Author :
Zhao You ; Bo Xu
Author_Institution :
Interactive Digital Media Technol. Res. Center, Inst. of Autom., Beijing, China
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
450
Lastpage :
453
Abstract :
Effective training of deep neural networks (DNNs) is of great importance for DNN-based speech recognition systems. Stochastic gradient descent (SGD) is the most popular method for training DNNs, and it often yields solutions that generalize well to held-out data. Recently, Hessian-Free (HF) optimization has emerged as an alternative algorithm for training DNNs, and it can be applied to pathological tasks. Stochastic Hessian-Free (SHF) optimization is a variant of HF that combines the generalization advantages of SGD with the second-order information exploited by HF. This paper investigates the SHF algorithm for DNN training. We evaluate the algorithm on a 100-hour Mandarin Chinese recorded-speech recognition task. The first experiment shows that choosing proper sizes for the gradient and curvature minibatches reduces training time while maintaining good performance. Next, we observe that the performance of SHF does not depend on the initial parameters. Furthermore, experimental results show that SHF performs comparably to SGD and better than traditional HF. Finally, we find that an additional performance improvement is obtained with the dropout algorithm.
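The SHF scheme described in the abstract — a gradient computed on a larger minibatch, curvature information (Gauss–Newton/Hessian-vector products) taken from a smaller minibatch, and the resulting damped Newton system solved by conjugate gradient — can be sketched as follows. This is a minimal illustration on a least-squares toy model, not the authors' implementation; the function names, damping value, and minibatch sizes are illustrative assumptions.

```python
import numpy as np

def shf_step(w, X, t, grad_idx, curv_idx, damping=1e-2, cg_iters=10):
    """One stochastic Hessian-free step for the least-squares model y = X @ w.

    The gradient is estimated on a (larger) gradient minibatch, while
    curvature enters only through Hessian-vector products on a (smaller)
    curvature minibatch; the damped system (H + damping*I) d = -g is
    solved approximately with conjugate gradient (CG).
    """
    Xg, tg = X[grad_idx], t[grad_idx]
    Xc = X[curv_idx]
    g = Xg.T @ (Xg @ w - tg) / len(grad_idx)      # minibatch gradient

    def hvp(v):
        # Gauss-Newton (here exact Hessian) product on the curvature batch
        return Xc.T @ (Xc @ v) / len(curv_idx) + damping * v

    # Conjugate gradient for (H + damping*I) d = -g
    d = np.zeros_like(w)
    r = -g - hvp(d)
    p = r.copy()
    rs = r @ r
    for _ in range(cg_iters):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < 1e-12:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return w + d

# Toy problem: recover w_true from noiseless targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
t = X @ w_true
w = np.zeros(5)
for step in range(20):
    grad_idx = rng.choice(200, size=64, replace=False)  # larger gradient minibatch
    curv_idx = rng.choice(200, size=16, replace=False)  # smaller curvature minibatch
    w = shf_step(w, X, t, grad_idx, curv_idx)
loss = 0.5 * np.mean((X @ w - t) ** 2)
```

The point of the two index sets mirrors the paper's first experiment: the curvature minibatch can be much smaller than the gradient minibatch, since each CG iteration needs only matrix-vector products, which keeps per-step cost low without degrading the update direction much.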
Keywords :
gradient methods; natural language processing; neural nets; speech recognition; stochastic programming; DNN training; HF optimization; Hessian free optimization; Mandarin Chinese recorded speech recognition task; SGD; SHF algorithm; deep neural networks; generalization advantages; optional algorithm; pathological tasks; second-order information; speech recognition system; stochastic Hessian free; stochastic Hessian-Free optimization; stochastic gradient descent; Error analysis; Neural networks; Optimization; Speech recognition; Stochastic processes; Training; Deep neural networks; Dropout; Speech recognition; Stochastic Hessian-Free optimization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936597
Filename :
6936597