DocumentCode :
353702
Title :
Putting it all together: language model combination
Author :
Goodman, Joshua T.
Author_Institution :
Speech Technol. Group, Microsoft Corp., Redmond, WA, USA
Volume :
3
fYear :
2000
fDate :
2000
Firstpage :
1647
Abstract :
In the past several years, a number of language modeling improvements over simple trigram models have been found, including caching, higher-order n-grams, skipping, modified Kneser-Ney smoothing, and clustering. While each of these techniques has been studied separately, they have rarely been studied in combination. We find some significant interactions, especially with smoothing techniques. The combination of all techniques leads to up to a 45% perplexity reduction over a Katz (1987) smoothed trigram model with no count cutoffs, the highest such perplexity reduction reported.
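Since the abstract compares models by perplexity over a trigram baseline, the following minimal Python sketch shows how trigram perplexity is typically computed. It uses simple linear interpolation as the smoother purely for illustration; the interpolation weights and the toy train/test strings are hypothetical, and this is not Goodman's actual system or the paper's modified Kneser-Ney method.

import math
from collections import Counter

def train_counts(tokens):
    # Collect unigram, bigram, and trigram counts from a token list.
    uni = Counter(tokens)
    bi = Counter(zip(tokens, tokens[1:]))
    tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return uni, bi, tri

def interp_prob(w1, w2, w3, uni, bi, tri, total, l1=0.1, l2=0.3, l3=0.6):
    # Linearly interpolated trigram probability; the lambda weights are
    # illustrative placeholders, not tuned values from the paper.
    p_uni = uni[w3] / total if total else 0.0
    p_bi = bi[(w2, w3)] / uni[w2] if uni[w2] else 0.0
    p_tri = tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0
    return l1 * p_uni + l2 * p_bi + l3 * p_tri

def perplexity(test_tokens, uni, bi, tri, total):
    # Perplexity = exp(average negative log-probability per predicted word);
    # a 45% reduction means this number drops by 45% on the same test set.
    log_sum, n = 0.0, 0
    for w1, w2, w3 in zip(test_tokens, test_tokens[1:], test_tokens[2:]):
        p = interp_prob(w1, w2, w3, uni, bi, tri, total)
        log_sum += math.log(max(p, 1e-12))  # floor to avoid log(0)
        n += 1
    return math.exp(-log_sum / n)

train = "the cat sat on the mat the cat ate the fish".split()
test = "the cat sat on the mat".split()
uni, bi, tri = train_counts(train)
print(perplexity(test, uni, bi, tri, sum(uni.values())))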
Keywords :
linguistics; natural languages; nomograms; pattern clustering; smoothing methods; speech processing; caching; clustering; count cutoffs; higher-order n-grams; language model combination; modified Kneser-Ney smoothing; perplexity reduction; skipping; smoothing techniques; trigram models; History; Interpolation; Smoothing methods; Speech recognition; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), Proceedings
Conference_Location :
Istanbul, Turkey
ISSN :
1520-6149
Print_ISBN :
0-7803-6293-4
Type :
conf
DOI :
10.1109/ICASSP.2000.862064
Filename :
862064