DocumentCode :
134248
Title :
Rapid Bayesian learning for recurrent neural network language model
Author :
Jen-Tzung Chien ; Yuan-Chu Ku ; Mou-Yue Huang
Author_Institution :
Dept. of Electr. & Comput. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
34
Lastpage :
38
Abstract :
This paper presents Bayesian learning for the recurrent neural network language model (RNN-LM). Our goal is to regularize the RNN-LM by compensating for the randomness of the estimated model parameters, which is characterized by a Gaussian prior. The model is not only constructed by training the synaptic weight parameters according to the maximum a posteriori criterion but also regularized by estimating the Gaussian hyper-parameter through type-2 maximum likelihood. However, a critical issue in Bayesian RNN-LM is the heavy computation of the Hessian matrix, which is formed as the sum of a large number of outer products of high-dimensional gradient vectors. We present a rapid approximation that reduces the redundancy due to the curse of dimensionality and speeds up the calculation by summing only the salient outer products. Experiments on the 1B-Word Benchmark, Penn Treebank and Wall Street Journal corpora show that the rapid Bayesian RNN-LM consistently improves perplexity and word error rate in comparison with the standard RNN-LM.
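To make the Hessian approximation concrete, the following is a minimal sketch of the general idea described in the abstract: instead of summing the outer products of all per-sample gradient vectors, only the most "salient" gradients are kept. The function name, the use of gradient norm as the saliency score, and the NumPy setting are illustrative assumptions, not the authors' exact procedure.

    # Minimal sketch (assumed, not the paper's exact algorithm): approximate an
    # outer-product Hessian by summing only the outer products of the gradients
    # with the largest norms, rather than all T gradients.
    import numpy as np

    def approx_hessian(grads, num_salient):
        """grads: (T, D) array of per-sample gradient vectors.
        Returns a (D, D) approximation built from the num_salient
        gradients with the largest Euclidean norm (assumed saliency score)."""
        norms = np.linalg.norm(grads, axis=1)      # one saliency score per gradient
        top = np.argsort(norms)[-num_salient:]     # indices of the most salient gradients
        g = grads[top]                             # (num_salient, D)
        return g.T @ g                             # sum of outer products g_t g_t^T

    # Example: 10,000 gradients of dimension 200, keep only the 100 most salient.
    rng = np.random.default_rng(0)
    G = rng.standard_normal((10_000, 200))
    H_approx = approx_hessian(G, num_salient=100)
    print(H_approx.shape)  # (200, 200)

The point of the approximation is that the cost of forming the matrix scales with the number of retained outer products rather than with the full training-set size.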
Keywords :
Bayes methods; Hessian matrices; gradient methods; learning (artificial intelligence); parameter estimation; recurrent neural nets; speech recognition; 1B-Word Benchmark; Gaussian hyper-parameter estimation; Hessian matrix; Penn Treebank; Wall Street Journal corpora; high-dimensional gradient vector; maximum a posteriori criterion; model parameter estimation; rapid Bayesian RNN-LM; rapid Bayesian learning; rapid approximation; recurrent neural network language model; synaptic weight parameters; type-2 maximum likelihood; word error rate; Approximation methods; Bayes methods; Computational modeling; Recurrent neural networks; Speech recognition; Training; Vectors; Bayesian learning; Hessian matrix; Recurrent neural network language model; speech recognition;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936640
Filename :
6936640