DocumentCode
1103789
Title
Estimation of probabilities in the language model of the IBM speech recognition system
Author
Nádas, Arthur
Author_Institution
IBM T.J. Watson Research Center, Yorktown Heights, NY
Volume
32
Issue
4
fYear
1984
fDate
8/1/1984 12:00:00 AM
Firstpage
859
Lastpage
861
Abstract
The language model probabilities are estimated by an empirical Bayes approach in which a prior distribution for the unknown probabilities is itself estimated through a novel choice of data. The predictive power of the model thus fitted is compared by means of its experimental perplexity [1] to the model as fitted by the Jelinek-Mercer deleted estimator and as fitted by the Turing-Good formulas for probabilities of unseen or rarely seen events.
Keywords
Bayesian methods; Cities and towns; Helium; Natural languages; Power system modeling; Predictive models; Probability; Smoothing methods; Speech recognition; Vocabulary
fLanguage
English
Journal_Title
Acoustics, Speech and Signal Processing, IEEE Transactions on
Publisher
IEEE
ISSN
0096-3518
Type
jour
DOI
10.1109/TASSP.1984.1164378
Filename
1164378