• DocumentCode
    2620846
  • Title

    Language Model Based on Word Order Sensitive Matrix Representation in Latent Semantic Analysis for Speech Recognition

  • Author

    Naptali, Welly ; Tsuchiya, Masatoshi ; Nakagawa, Seiichi

  • Author_Institution
    Dept. of Inf. & Comput. Sci., Toyohashi Univ. of Technol., Toyohashi, Japan
  • Volume
    7
  • fYear
    2009
  • fDate
    March 31 2009-April 2 2009
  • Firstpage
    252
  • Lastpage
    256
  • Abstract
    This paper investigates matrix representations in the latent semantic analysis (LSA) framework for a language model. In LSA, a word-document matrix is usually used to represent a corpus. However, this matrix ignores word order within a sentence. We propose several word co-occurrence matrices that preserve word order for use in LSA. To support these matrices, we define a context dependent class (CDC) language model, which distinguishes classes according to their context in the sentences. Experiments on the Wall Street Journal (WSJ) corpus show that the proposed method achieves better performance than the original LSA with a word-document matrix.
  • Keywords
    simulation languages; speech recognition; context dependent class language; language model; latent semantic analysis; wall street journal corpus; word order sensitive matrix representation; Computer science; Context modeling; Equations; History; Information analysis; Natural languages; Neural networks; Power system modeling; Speech analysis; Word co-occurrence matrix
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Engineering, 2009 WRI World Congress on
  • Conference_Location
    Los Angeles, CA
  • Print_ISBN
    978-0-7695-3507-4
  • Type

    conf

  • DOI
    10.1109/CSIE.2009.353
  • Filename
    5170320