Title :
Exploiting syntactic, semantic and lexical regularities in language modeling via directed Markov random fields
Author :
Wang, Shaojun ; Wang, Shaomin ; Greiner, Russell ; Schuurmans, Dale ; Cheng, Li
Abstract :
We present a directed Markov random field (MRF) model that combines n-gram models, probabilistic context-free grammars (PCFGs), and probabilistic latent semantic analysis (PLSA) for the purpose of statistical language modeling. The composite directed MRF model has a potentially exponential number of loops and becomes a context-sensitive grammar; nevertheless, we are able to estimate its parameters in cubic time using an efficient modified EM method, the generalized inside-outside algorithm, which extends the inside-outside algorithm to incorporate the effects of the n-gram and PLSA language models.
Keywords :
Markov processes; context-free grammars; linguistics; maximum likelihood estimation; natural languages; PCFG; PLSA; composite directed MRF model; context sensitive grammar; directed Markov random fields; generalized inside-outside algorithm; lexical regularities; modified EM method; n-gram models; probabilistic context free grammars; probabilistic latent semantic analysis; semantic regularities; statistical language modeling; syntactic regularities; Context modeling; Humans; Information retrieval; Interpolation; Markov random fields; Probability; Speech recognition; Stochastic processes
Conference_Title :
Chinese Spoken Language Processing, 2004 International Symposium on
Print_ISBN :
0-7803-8678-7
DOI :
10.1109/CHINSL.2004.1409647