• DocumentCode
    2382874
  • Title

    Language model adaptation using mixtures and an exponentially decaying cache

  • Author

    Clarkson, P.R.; Robinson, A.J.

  • Author_Institution
    Dept. of Eng., Cambridge Univ., UK
  • Volume
    2
  • fYear
    1997
  • fDate
    21-24 Apr 1997
  • Firstpage
    799
  • Abstract
    Presents two techniques for language model adaptation. The first is based on the use of mixtures of language models: the training text is partitioned according to topic, a language model is constructed for each component and, at recognition time, appropriate weightings are assigned to each component to model the observed style of language. The second technique is based on augmenting the standard trigram model with a cache component in which the words' recurrence probabilities decay exponentially over time. Both techniques yield a significant reduction in perplexity over the baseline trigram language model when faced with a multi-domain test text, the mixture-based model giving a 24% reduction and the cache-based model giving a 14% reduction. The two techniques attack the problem of adaptation at different scales, and as a result can be used in parallel to give a total perplexity reduction of 30%.
  • Keywords
    adaptive systems; cache storage; exponential distribution; natural languages; nomograms; speech recognition; exponentially decaying cache; language model adaptation; language style; mixture-based model; multi-domain test text; perplexity reduction; recognition; text topics; training text partitioning; trigram language model; weight assignment; word recurrence probabilities; Adaptation model; Integrated circuit modeling; Natural languages; Partial response channels; Predictive models; Speech; Testing; Text recognition; Training data; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
  • Conference_Location
    Munich
  • ISSN
    1520-6149
  • Print_ISBN
    0-8186-7919-0
  • Type

    conf

  • DOI
    10.1109/ICASSP.1997.596049
  • Filename
    596049
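
The two techniques summarized in the Abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives no formulas, so everything here is an assumption: the cache weights each past word by exp(-decay_rate * distance) and normalizes over the history, the cache is linearly interpolated with a trigram probability supplied by the caller, and the mixture weights are re-estimated with EM on the text observed so far. The names DecayingCache, interpolate, em_mixture_weights, decay_rate and cache_weight are hypothetical.

```python
import math
from collections import defaultdict


class DecayingCache:
    """Cache in which a word's contribution decays exponentially with age.

    Assumed form: P_cache(w) is proportional to the sum over past
    occurrences of w of exp(-decay_rate * distance).
    """

    def __init__(self, decay_rate=0.005):
        self.decay = math.exp(-decay_rate)  # per-word decay factor
        self.scores = defaultdict(float)    # decayed count per word
        self.total = 0.0                    # sum of all decayed counts

    def prob(self, word):
        """Normalized cache probability of `word` given the history so far."""
        if self.total == 0.0:
            return 0.0
        return self.scores.get(word, 0.0) / self.total

    def add(self, word):
        """Age every stored word by one step, then record the new word."""
        for w in self.scores:
            self.scores[w] *= self.decay
        self.total = self.total * self.decay + 1.0
        self.scores[word] += 1.0


def interpolate(p_cache, p_trigram, cache_weight=0.1):
    """Linear interpolation of cache and trigram probabilities (assumed)."""
    return cache_weight * p_cache + (1.0 - cache_weight) * p_trigram


def em_mixture_weights(component_probs, iterations=20):
    """Estimate mixture weights with EM (an assumed estimation method).

    component_probs[t][k] is the probability that topic model k assigns
    to the t-th observed word given its history.
    """
    k = len(component_probs[0])
    weights = [1.0 / k] * k
    for _ in range(iterations):
        counts = [0.0] * k
        for probs in component_probs:
            denom = sum(w * p for w, p in zip(weights, probs))
            if denom == 0.0:
                continue  # no component explains this word; skip it
            for j in range(k):
                counts[j] += weights[j] * probs[j] / denom
        total = sum(counts)
        if total == 0.0:
            break
        weights = [c / total for c in counts]
    return weights
```

In use, the recognizer would call prob for each candidate word, combine it with the trigram (or mixture) probability via interpolate, and call add once the word is accepted. Because the cache reacts within a few words while the mixture weights track topic over longer stretches, the two can be combined, which matches the abstract's remark that they operate at different scales.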