• DocumentCode
    2980843
  • Title

    Combined models for topic spotting and topic-dependent language modeling

  • Author

    Bigi, Brigitte ; De Mori, Renato ; El-Béze, Marc ; Spriet, Thierry

  • Author_Institution
    Avignon Univ., France
  • fYear
    1997
  • fDate
    14-17 Dec 1997
  • Firstpage
    535
  • Lastpage
    542
  • Abstract
    A new statistical method for language modeling and spoken document classification is proposed. It is based on a mixture of topic-dependent probabilities. Each topic-dependent probability is in turn a mixture of n-gram probabilities and the probability of Kullback-Leibler (KL) distances between keyword unigrams and the distribution obtained from the content of a cache memory. Experimental results on topic classification using a corpus of 60 Mwords from the French newspaper Le Monde show the excellent performance of the cache memory and its complementary role in providing different statistics for the decision process.
  • Keywords
    cache storage; natural languages; pattern classification; probability; speech recognition; French newspaper Le Monde; Kullback-Leibler distances; cache memory; combined models; decision process; keyword unigrams; n-gram probabilities; spoken document classification; statistical method; topic classification; topic dependent language modeling; topic dependent probabilities; topic spotting; Cache memory; History; Information retrieval; Natural languages; Probability; Statistical analysis; Statistical distributions; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on
  • Conference_Location
    Santa Barbara, CA
  • Print_ISBN
    0-7803-3698-4
  • Type

    conf

  • DOI
    10.1109/ASRU.1997.659133
  • Filename
    659133