  • DocumentCode
    417248
  • Title
    Corrective language modeling for large vocabulary ASR with the perceptron algorithm
  • Author
    Roark, Brian ; Saraclar, Murat ; Collins, Michael
  • Author_Institution
    AT&T Labs - Research, USA
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    This paper investigates error-corrective language modeling using the perceptron algorithm on word lattices. The resulting model is encoded as a weighted finite-state automaton, and is used by intersecting the model with word lattices, making it simple and inexpensive to apply during decoding. We present results for various training scenarios for the Switchboard task, including using n-gram features of different orders, and performing n-best extraction versus using full word lattices. We demonstrate the importance of making the training conditions as close as possible to testing conditions. The best approach yields a 1.3 percent improvement in first-pass accuracy, which translates to a 0.5 percent improvement after other rescoring passes.
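    To make the approach concrete, below is a minimal sketch of the n-best-reranking variant the abstract mentions: a structured perceptron that scores each candidate transcription by its n-gram features and updates the weights toward the oracle (lowest word-error) candidate. This is an illustrative toy, not the authors' lattice/WFSA implementation; all function names (`ngram_feats`, `perceptron_train`, etc.) are assumptions for this sketch.

    ```python
    from collections import Counter

    def ngram_feats(words, n=2):
        """Count n-grams of orders 1..n in a word sequence."""
        feats = Counter()
        for order in range(1, n + 1):
            for i in range(len(words) - order + 1):
                feats[tuple(words[i:i + order])] += 1
        return feats

    def score(weights, feats):
        """Linear model: dot product of weights and feature counts."""
        return sum(weights[f] * v for f, v in feats.items())

    def edit_distance(a, b):
        """Word-level Levenshtein distance (used to find the oracle)."""
        prev = list(range(len(b) + 1))
        for i, wa in enumerate(a, 1):
            cur = [i]
            for j, wb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                               prev[j - 1] + (wa != wb)))
            prev = cur
        return prev[-1]

    def perceptron_train(nbest_lists, references, epochs=5, n=2):
        """For each utterance, compare the current best-scoring candidate
        to the oracle (lowest word-error) candidate; when they differ,
        add the oracle's n-gram counts and subtract the mistake's."""
        w = Counter()
        for _ in range(epochs):
            for cands, ref in zip(nbest_lists, references):
                oracle = min(cands, key=lambda c: edit_distance(c, ref))
                best = max(cands, key=lambda c: score(w, ngram_feats(c, n)))
                if best != oracle:
                    for f, v in ngram_feats(oracle, n).items():
                        w[f] += v
                    for f, v in ngram_feats(best, n).items():
                        w[f] -= v
        return w

    # Toy usage: two candidate transcriptions of one utterance.
    nbest = [[["eye", "saw", "the", "dog"], ["i", "saw", "the", "dog"]]]
    refs = [["i", "saw", "the", "dog"]]
    w = perceptron_train(nbest, refs)
    # After training, the oracle candidate outscores the error.
    assert score(w, ngram_feats(["i", "saw", "the", "dog"])) > \
           score(w, ngram_feats(["eye", "saw", "the", "dog"]))
    ```

    The paper's actual model applies the learned n-gram weights by encoding them as a weighted finite-state automaton and intersecting it with the full word lattice, which scores all hypotheses in the lattice rather than an extracted n-best list.
    
    
    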
  • Keywords
    error correction; feature extraction; finite automata; learning (artificial intelligence); perceptrons; speech recognition; vocabulary; Switchboard task; automatic speech recognition; error-corrective language modeling; large vocabulary ASR; n-best extraction; n-gram features; perceptron algorithm; training; weighted finite-state automaton; word lattices; Artificial intelligence; Automata; Automatic speech recognition; Costs; Decoding; Hidden Markov models; Laboratories; Lattices; Testing; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04)
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326094
  • Filename
    1326094