DocumentCode :
134205
Title :
Investigation of stochastic Hessian-Free optimization in Deep neural networks for speech recognition
Author :
Zhao You ; Bo Xu
Author_Institution :
Interactive Digital Media Technol. Res. Center, Inst. of Autom., Beijing, China
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
450
Lastpage :
453
Abstract :
Effective training of deep neural networks (DNNs) is of great importance for DNN-based speech recognition systems. Stochastic gradient descent (SGD) is the most popular method for training DNNs, and it often yields solutions that generalize well to held-out data. Recently, Hessian-Free (HF) optimization has emerged as an alternative algorithm for training DNNs, and it can be applied to pathological tasks. Stochastic Hessian-Free (SHF) optimization is a variant of HF that combines the generalization advantages of SGD with the second-order information exploited by HF. This paper investigates the SHF algorithm for DNN training. We evaluate the algorithm on a 100-hour Mandarin Chinese recorded-speech recognition task. The first experiment shows that choosing proper sizes for the gradient and curvature minibatches reduces training time while maintaining good performance. Next, we observe that the performance of SHF does not depend on the initial parameters. Furthermore, experimental results show that SHF performs comparably to SGD and better than traditional HF. Finally, we find that an additional performance improvement is obtained with the dropout algorithm.
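The SHF scheme described in the abstract — a gradient computed on a larger minibatch, curvature information (Gauss–Newton/Hessian-vector products) taken from a smaller minibatch, and the resulting damped Newton system solved by conjugate gradient — can be sketched as follows. This is a minimal illustration on a least-squares toy model, not the authors' implementation; the function names, damping value, and minibatch sizes are illustrative assumptions.

```python
import numpy as np

def shf_step(w, X, t, grad_idx, curv_idx, damping=1e-2, cg_iters=10):
    """One stochastic Hessian-free step for the least-squares model y = X @ w.

    The gradient is estimated on a (larger) gradient minibatch, while
    curvature enters only through Hessian-vector products on a (smaller)
    curvature minibatch; the damped system (H + damping*I) d = -g is
    solved approximately with conjugate gradient (CG).
    """
    Xg, tg = X[grad_idx], t[grad_idx]
    Xc = X[curv_idx]
    g = Xg.T @ (Xg @ w - tg) / len(grad_idx)      # minibatch gradient

    def hvp(v):
        # Gauss-Newton (here exact Hessian) product on the curvature batch
        return Xc.T @ (Xc @ v) / len(curv_idx) + damping * v

    # Conjugate gradient for (H + damping*I) d = -g
    d = np.zeros_like(w)
    r = -g - hvp(d)
    p = r.copy()
    rs = r @ r
    for _ in range(cg_iters):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < 1e-12:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return w + d

# Toy problem: recover w_true from noiseless targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
t = X @ w_true
w = np.zeros(5)
for step in range(20):
    grad_idx = rng.choice(200, size=64, replace=False)  # larger gradient minibatch
    curv_idx = rng.choice(200, size=16, replace=False)  # smaller curvature minibatch
    w = shf_step(w, X, t, grad_idx, curv_idx)
loss = 0.5 * np.mean((X @ w - t) ** 2)
```

The point of the two index sets mirrors the paper's first experiment: the curvature minibatch can be much smaller than the gradient minibatch, since each CG iteration needs only matrix-vector products, which keeps per-step cost low without degrading the update direction much.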
Keywords :
gradient methods; natural language processing; neural nets; speech recognition; stochastic programming; DNN training; HF optimization; Hessian free optimization; Mandarin Chinese recorded speech recognition task; SGD; SHF algorithm; deep neural networks; generalization advantages; optional algorithm; pathological tasks; second-order information; speech recognition system; stochastic Hessian free; stochastic Hessian-Free optimization; stochastic gradient descent; Error analysis; Neural networks; Optimization; Speech recognition; Stochastic processes; Training; Deep neural networks; Dropout; Speech recognition; Stochastic Hessian-Free optimization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936597
Filename :
6936597