Title :
Bayesian recurrent neural network language model
Author :
Jen-Tzung Chien ; Yuan-Chu Ku
Author_Institution :
Dept. of Electr. & Comput. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Abstract :
This paper presents a Bayesian approach to constructing a recurrent neural network language model (RNN-LM) for speech recognition. Our idea is to regularize the RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function of the Bayesian RNN (BRNN) is formed as a regularized cross entropy error function. The regularized model is constructed not only by training the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix is developed by selecting a small set of salient outer-products, and is shown to be effective for the BRNN-LM. The BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show promising improvements from applying the BRNN-LM with different amounts of training data.
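The abstract combines three standard Bayesian-learning ingredients: a cross entropy objective with a Gaussian-prior (L2) penalty for MAP training, an outer-product approximation to the Hessian restricted to a small set of salient terms, and marginal-likelihood (evidence) maximization to set the prior precision. The following is a minimal NumPy sketch of those pieces, not the authors' implementation: all names (regularized_cross_entropy, salient_outer_product_hessian, reestimate_alpha, alpha, top_k) are illustrative assumptions, and the hyperparameter update follows the generic MacKay-style evidence fixed point rather than the paper's exact procedure.

# Sketch of MAP-regularized training pieces for a Bayesian NN language model.
# Assumptions as noted above; not the authors' exact method.
import numpy as np

def regularized_cross_entropy(logits, targets, w, alpha):
    # Cross entropy of softmax outputs plus the Gaussian-prior penalty
    # (alpha / 2) * ||w||^2 that MAP estimation with a zero-mean Gaussian
    # prior adds to the data term.
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(targets)), targets].sum()
    return nll + 0.5 * alpha * np.dot(w, w)

def salient_outer_product_hessian(per_sample_grads, top_k):
    # Outer-product (Gauss-Newton-style) approximation: the Hessian is
    # replaced by a sum of per-sample gradient outer products g g^T, keeping
    # only the top_k gradients with the largest norm (the "salient" ones).
    norms = np.linalg.norm(per_sample_grads, axis=1)
    salient = per_sample_grads[np.argsort(norms)[-top_k:]]
    return salient.T @ salient

def reestimate_alpha(data_hessian, w, alpha):
    # One evidence-framework fixed-point update: with eigenvalues lambda_i of
    # the data Hessian, gamma = sum_i lambda_i / (lambda_i + alpha) counts the
    # effective number of parameters, and alpha <- gamma / ||w||^2.
    lam = np.clip(np.linalg.eigvalsh(data_hessian), 0.0, None)
    gamma = np.sum(lam / (lam + alpha))
    return gamma / (np.dot(w, w) + 1e-12)

# Toy usage: random vectors stand in for per-sample RNN gradients.
rng = np.random.default_rng(0)
w = rng.normal(size=50)
grads = rng.normal(size=(200, 50))
H = salient_outer_product_hessian(grads, top_k=20)
alpha = 1.0
for _ in range(5):
    alpha = reestimate_alpha(H, w, alpha)

Restricting the outer-product sum to the largest-norm gradients keeps the Hessian approximation cheap while retaining the curvature directions that dominate the evidence update, which is presumably the motivation for selecting only salient outer-products.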
Keywords :
Bayes methods; Gaussian processes; Hessian matrices; approximation theory; maximum likelihood estimation; recurrent neural nets; speech recognition; BRNN-LM; Bayesian recurrent neural network language model; Gaussian hyperparameter estimation; Gaussian prior; Hessian matrix rapid approximation; RNN-LM; estimated model parameter uncertainty compensation; marginal likelihood estimation; maximum a posteriori criterion; objective function; regularized cross entropy error function; salient outer-products; sparser model; training data; Approximation methods; Computational modeling; History; Neurons; Recurrent neural networks; Training; Bayesian learning; Hessian matrix; Recurrent neural network; language model
Conference_Title :
Spoken Language Technology Workshop (SLT), 2014 IEEE
DOI :
10.1109/SLT.2014.7078575