• DocumentCode
    2839366
  • Title

    Exploiting syntactic, semantic and lexical regularities in language modeling via directed Markov random fields

  • Author

    Wang, Shaojun ; Wang, Shaomin ; Greiner, Russell ; Schuurmans, Dale ; Cheng, Li

  • fYear
    2004
  • fDate
    15-18 Dec. 2004
  • Firstpage
    305
  • Lastpage
    308
  • Abstract
    We present a directed Markov random field (MRF) model that combines n-gram models, probabilistic context-free grammars (PCFGs), and probabilistic latent semantic analysis (PLSA) for statistical language modeling. The composite directed MRF model has a potentially exponential number of loops and becomes a context-sensitive grammar; nevertheless, we are able to estimate its parameters in cubic time using an efficient modified EM method, the generalized inside-outside algorithm, which extends the inside-outside algorithm to incorporate the effects of the n-gram and PLSA language models.
  • Keywords
    Markov processes; context-free grammars; linguistics; maximum likelihood estimation; natural languages; PCFG; PLSA; composite directed MRF model; context sensitive grammar; directed Markov random fields; generalized inside-outside algorithm; lexical regularities; maximum likelihood estimation; modified EM method; n-gram models; probabilistic context free grammars; probabilistic latent semantic analysis; semantic regularities; statistical language modeling; syntactic regularities; Context modeling; Humans; Information retrieval; Interpolation; Markov random fields; Maximum likelihood estimation; Natural languages; Probability; Speech recognition; Stochastic processes
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing, 2004 International Symposium on
  • Print_ISBN
    0-7803-8678-7
  • Type

    conf

  • DOI
    10.1109/CHINSL.2004.1409647
  • Filename
    1409647