DocumentCode
417248
Title
Corrective language modeling for large vocabulary ASR with the perceptron algorithm
Author
Roark, Brian ; Saraclar, Murat ; Collins, Michael
Author_Institution
AT&T Labs - Research, USA
Volume
1
fYear
2004
fDate
17-21 May 2004
Abstract
This paper investigates error-corrective language modeling using the perceptron algorithm on word lattices. The resulting model is encoded as a weighted finite-state automaton and is applied by intersecting it with word lattices, making it simple and inexpensive to use during decoding. We present results for various training scenarios on the Switchboard task, including n-gram features of different orders and n-best extraction versus full word lattices. We demonstrate the importance of matching training conditions as closely as possible to testing conditions. The best approach yields a 1.3 percent improvement in first-pass accuracy, which translates to a 0.5 percent improvement after subsequent rescoring passes.
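The abstract's corrective-modeling idea can be illustrated with a minimal sketch of a structured perceptron that re-ranks recognizer hypotheses using n-gram features. This is a simplified, hypothetical illustration: it operates on n-best lists (one of the training variants the abstract mentions) rather than full lattices, and all function names and toy data below are assumptions, not from the paper, which encodes the learned model as a weighted finite-state automaton intersected with lattices.

```python
# Hypothetical sketch of perceptron-based corrective language modeling
# over n-best lists; names and data are illustrative, not from the paper.
from collections import Counter

def ngram_features(words, n=2):
    """Count n-grams up to order n (unigrams and bigrams by default)."""
    feats = Counter()
    for order in range(1, n + 1):
        for i in range(len(words) - order + 1):
            feats[tuple(words[i:i + order])] += 1
    return feats

def perceptron_epoch(nbest_lists, oracles, weights, base_scores):
    """One training pass. For each utterance, pick the hypothesis that
    maximizes baseline score + corrective score under the current weights;
    if it is not the oracle (lowest-error) hypothesis, move the weights
    toward the oracle's features and away from the chosen hypothesis's."""
    for hyps, oracle, scores in zip(nbest_lists, oracles, base_scores):
        def total(i):
            f = ngram_features(hyps[i])
            return scores[i] + sum(weights.get(k, 0.0) * v for k, v in f.items())
        best = max(range(len(hyps)), key=total)
        if hyps[best] != oracle:
            oi = hyps.index(oracle)
            for k, v in ngram_features(hyps[oi]).items():
                weights[k] = weights.get(k, 0.0) + v
            for k, v in ngram_features(hyps[best]).items():
                weights[k] = weights.get(k, 0.0) - v
    return weights
```

A design point this sketch makes concrete: because the learned "model" is just a weighted sum of n-gram feature counts, it can equivalently be compiled into a weighted finite-state automaton and composed (intersected) with the recognizer's word lattices, which is what makes the approach cheap to apply during decoding.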
Keywords
error correction; feature extraction; finite automata; learning (artificial intelligence); perceptrons; speech recognition; vocabulary; Switchboard task; automatic speech recognition; error-corrective language modeling; large vocabulary ASR; n-best extraction; n-gram features; perceptron algorithm; training; weighted finite-state automaton; word lattices; Artificial intelligence; Automata; Automatic speech recognition; Costs; Decoding; Hidden Markov models; Laboratories; Lattices; Testing; Vocabulary;
fLanguage
English
Publisher
ieee
Conference_Titel
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04), 2004
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326094
Filename
1326094
Link To Document