Title :
Putting it all together: language model combination
Author :
Goodman, Joshua T.
Author_Institution :
Speech Technol. Group, Microsoft Corp., Redmond, WA, USA
Abstract :
In the past several years, a number of language modeling improvements over simple trigram models have been found, including caching, higher-order n-grams, skipping, modified Kneser-Ney smoothing, and clustering. While all of these techniques have been studied separately, they have rarely been studied in combination. We find some significant interactions, especially with smoothing techniques. The combination of all techniques leads to up to a 45% perplexity reduction over a Katz (1987) smoothed trigram model with no count cutoffs, the largest such perplexity reduction reported.
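As an illustration of two of the ideas the abstract names, a minimal sketch of perplexity evaluation and cache interpolation (not the paper's own implementation; the toy corpus, the fixed background unigram model standing in for a trigram model, and the interpolation weight `lam` are all assumptions):

```python
import math
from collections import Counter

def perplexity(probs):
    """Perplexity = exp of the average negative log-probability."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# Hypothetical background unigram model (stand-in for a full trigram model).
background = {"the": 0.4, "cat": 0.2, "sat": 0.2, "mat": 0.2}

# A "cache" model: a unigram distribution over recently seen words,
# linearly interpolated with the background model.
history = ["the", "cat", "sat", "the"]   # recently observed words
cache = Counter(history)
lam = 0.3                                # interpolation weight (assumed)

def p_interp(w):
    p_cache = cache[w] / len(history)
    return lam * p_cache + (1 - lam) * background[w]

test_words = ["the", "cat", "sat", "mat"]
ppl_bg = perplexity([background[w] for w in test_words])
ppl_mix = perplexity([p_interp(w) for w in test_words])
print(ppl_bg, ppl_mix)
```

On real text the cache typically lowers perplexity because recently seen words tend to recur; on this four-word toy example the effect can go either way, which is why the paper measures the techniques on large corpora and in combination.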
Keywords :
linguistics; natural languages; nomograms; pattern clustering; smoothing methods; speech processing; caching; clustering; count cutoffs; higher-order n-grams; language model combination; modified Kneser-Ney smoothing; perplexity reduction; skipping; smoothing techniques; trigram models; History; Interpolation; Smoothing methods; Speech recognition; Training data
Conference_Titel :
Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00)
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.862064