DocumentCode :
2979423
Title :
Analyzing and predicting language model improvements
Author :
Iyer, R. ; Ostendorf, M. ; Meteer, M.
Author_Institution :
Electr. & Comput. Eng. Dept., Boston Univ., MA, USA
fYear :
1997
fDate :
14-17 Dec 1997
Firstpage :
254
Lastpage :
261
Abstract :
Statistical n-gram language models are traditionally developed using perplexity as a measure of goodness. However, perplexity often demonstrates a poor correlation with recognition improvements, mainly because it fails to account for the acoustic confusability between words and for search errors in a recognizer. In this paper, we study alternatives to perplexity for predicting language model performance, including other global features as well as a new approach that predicts, with a high correlation (0.96), performance differences associated with localized changes in language models, given a recognition system. Experiments focus on the problem of augmenting in-domain Switchboard text with out-of-domain text from the Wall Street Journal and broadcast news that differ in both style and content from the in-domain data.
Keywords :
natural languages; nomograms; performance index; speech recognition; statistics; Wall Street Journal; acoustic confusability; broadcast news; correlation; global features; goodness measure; in-domain Switchboard text; language model improvements; language model performance prediction; localized changes; out-of-domain text; performance differences; perplexity; search errors; speech recognition improvements; statistical n-gram language models; text content; text style; word confusion; Acoustic measurements; Broadcasting; Degradation; Error analysis; Internet; Natural languages; Predictive models; Probability; Speech recognition; Text recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on
Conference_Location :
Santa Barbara, CA
Print_ISBN :
0-7803-3698-4
Type :
conf
DOI :
10.1109/ASRU.1997.659013
Filename :
659013