• DocumentCode
    336822
  • Title

    A class-based language model for large-vocabulary speech recognition extracted from part-of-speech statistics

  • Author

    Samuelsson, Christer ; Reichl, Wolfgang

  • Author_Institution
    AT&T Bell Labs., Murray Hill, NJ, USA
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    537
  • Abstract
    A novel approach to class-based language modeling based on part-of-speech statistics is presented. It uses a deterministic word-to-class mapping, which handles words with alternative part-of-speech assignments through the use of ambiguity classes. The predictive power of word-based language models and the generalization capability of class-based language models are combined using both linear interpolation and word-to-class backoff, and both methods are evaluated. Since each word belongs to precisely one ambiguity class, an exact word-to-class backoff model can easily be constructed. Empirical evaluations on large-vocabulary speech-recognition tasks show perplexity improvements and significant reductions in word error rate.
  • Keywords
    error statistics; interpolation; natural languages; speech recognition; ambiguity classes; class-based language model; deterministic word-to-class mapping; large-vocabulary speech recognition; linear interpolation; part-of-speech assignments; part-of-speech statistics; perplexity improvements; word error-rate reduction; word-based language models; word-to-class backoff; Decoding; Lattices; Natural languages; Predictive models; Speech recognition; Springs; Statistics; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1999.758181
  • Filename
    758181
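
The abstract's two central ideas, a deterministic word-to-class mapping through ambiguity classes and the linear interpolation of a word-based model with a class-based one, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy lexicon, the tag names, the bigram order, the unsmoothed maximum-likelihood estimates, the fixed interpolation weight `lam`, and all function names. The word-to-class backoff variant is not shown.

```python
from collections import Counter

# Hypothetical toy lexicon: every part-of-speech tag each word may take.
LEXICON = {
    "the": {"DET"}, "a": {"DET"},
    "dog": {"NOUN"}, "cat": {"NOUN"},
    "walks": {"NOUN", "VERB"}, "runs": {"NOUN", "VERB"},
}

def ambiguity_class(word):
    """Deterministic mapping: a word's class is the set of all its possible tags."""
    return "|".join(sorted(LEXICON[word]))

def train(sentences):
    """Collect word-level and class-level counts from tokenized sentences."""
    names = ("word", "cls", "word_bigram", "word_hist", "class_bigram", "class_hist")
    counts = {name: Counter() for name in names}
    for sent in sentences:
        classes = [ambiguity_class(w) for w in sent]
        counts["word"].update(sent)
        counts["cls"].update(classes)
        counts["word_bigram"].update(zip(sent, sent[1:]))
        counts["word_hist"].update(sent[:-1])      # words seen as a bigram history
        counts["class_bigram"].update(zip(classes, classes[1:]))
        counts["class_hist"].update(classes[:-1])  # classes seen as a bigram history
    return counts

def ratio(num, den):
    """Relative frequency, defined as 0 when the denominator is 0."""
    return num / den if den else 0.0

def p_word(counts, prev, word):
    """Word bigram estimate P(word | prev)."""
    return ratio(counts["word_bigram"][(prev, word)], counts["word_hist"][prev])

def p_class(counts, prev, word):
    """Class bigram estimate P(c(word) | c(prev)) * P(word | c(word))."""
    c_prev, c = ambiguity_class(prev), ambiguity_class(word)
    transition = ratio(counts["class_bigram"][(c_prev, c)], counts["class_hist"][c_prev])
    emission = ratio(counts["word"][word], counts["cls"][c])
    return transition * emission

def p_interp(counts, prev, word, lam=0.5):
    """Linear interpolation of the word-based and class-based estimates."""
    return lam * p_word(counts, prev, word) + (1 - lam) * p_class(counts, prev, word)

counts = train([["the", "dog", "runs"], ["a", "cat", "walks"]])
print(p_word(counts, "dog", "walks"), p_interp(counts, "dog", "walks"))
```

In this toy corpus the bigram "dog walks" never occurs, so the word-based estimate is zero, while the class-based estimate generalizes from "cat walks" because "dog" and "cat" share a class. Because each word maps to exactly one ambiguity class, the class-based probabilities for a given history sum to one over the vocabulary without any summation over alternative tags.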