Title :
Bayesian Learning of N-Gram Statistical Language Modeling
Author :
Bai, Shuanhu ; Li, Haizhou
Author_Institution :
Inst. for Infocomm Res.
Abstract :
N-gram language model adaptation is typically formulated using deleted interpolation under the maximum likelihood estimation framework. This paper proposes a Bayesian learning framework for n-gram statistical language model training and adaptation. By introducing a Dirichlet conjugate prior on the n-gram parameters, we formulate deleted interpolation under the maximum a posteriori criterion with a Bayesian learning procedure. We study the Bayesian learning formulation for n-gram and continuous n-gram language models. Experiments on the North American News Text corpus validate the effectiveness of the proposed algorithms.
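The link between a Dirichlet prior and interpolation can be illustrated with a minimal sketch. It assumes the textbook Dirichlet-multinomial MAP estimate for a single n-gram history, with hyperparameters alpha_w = 1 + tau * prior_probs[w]; the function names, the prior strength `tau`, and the toy counts are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
def map_ngram_probs(counts, prior_probs, tau):
    """MAP word probabilities for one history under a Dirichlet prior.

    Assumption: alpha_w = 1 + tau * prior_probs[w], so the posterior mode is
    (c_w + tau * prior_probs[w]) / (N + tau), where N is the total count.
    """
    total = sum(counts.get(w, 0) for w in prior_probs)
    return {
        w: (counts.get(w, 0) + tau * p) / (total + tau)
        for w, p in prior_probs.items()
    }


def interpolation_weight(total, tau):
    """Weight on the relative-frequency (ML) estimate implied by the MAP form."""
    return total / (total + tau)


# Toy adaptation data for one history, and a background (prior) distribution.
counts = {"a": 3, "b": 1}
prior_probs = {"a": 0.5, "b": 0.25, "c": 0.25}
tau = 4.0

probs = map_ngram_probs(counts, prior_probs, tau)
lam = interpolation_weight(sum(counts.values()), tau)

# The same numbers, written as a linear interpolation of ML and prior estimates.
total = sum(counts.values())
interp = {
    w: lam * counts.get(w, 0) / total + (1 - lam) * p
    for w, p in prior_probs.items()
}
```

Under these assumptions the MAP estimate equals lam * (ML estimate) + (1 - lam) * (prior estimate) with lam = N / (N + tau), so the interpolation weight grows with the amount of adaptation data instead of being tuned separately on held-out data.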
Keywords :
belief networks; interpolation; maximum likelihood estimation; natural languages; Bayesian learning; Dirichlet conjugate; deleted interpolation; maximum a posteriori criterion; n-gram statistical language modeling; Acoustic waves; Adaptation model; Bayesian methods; Humans; Interpolation; Maximum likelihood estimation; Probability; Speech recognition; Testing; Vocabulary
Conference_Title :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
Print_ISBN :
1-4244-0469-X
DOI :
10.1109/ICASSP.2006.1660203