Title :
Bayesian estimation methods for n-gram language model adaptation
Author :
Federico, Marcello
Author_Institution :
IRST, Trento, Italy
Abstract :
Stochastic n-gram language models have been successfully applied in continuous speech recognition for several years. Such language models provide many computational advantages but also require huge text corpora for parameter estimation. Moreover, the texts must exactly reflect, in a statistical sense, the user's language. Estimating a language model on a sample that is not representative severely affects speech recognition performance. A solution to this problem is provided by the Bayesian learning framework. Beyond the classical estimates, a Bayes-derived interpolation model is proposed. Empirical comparisons have been carried out on a 10,000-word radiological reporting domain. Results are provided in terms of perplexity and recognition accuracy.
Keywords :
Bayes methods; computational linguistics; interpolation; learning (artificial intelligence); natural language interfaces; parameter estimation; probability; speech recognition; statistical analysis; stochastic processes; Bayesian estimation methods; Bayesian learning; continuous speech recognition; huge text corpora; interpolation model; n-gram language model adaptation; parameter estimation; radiological reporting domain; speech recognition performance; statistical; stochastic n-gram language models; Adaptation model; Bayesian methods; Computational modeling; Frequency; Hidden Markov models; Interpolation; Natural languages; Spectral analysis; Speech recognition; Stochastic processes;
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607087