Three probabilistic language models for a large-vocabulary speech recognizer

Author

Dumouchel, P. ; Gupta, V. ; Lennig, M. ; Mermelstein, P.

Author_Institution

INRS-Telecommun., Montreal, Que., Canada

fYear

1988

fDate

11-14 Apr 1988

Firstpage

513

Abstract

Relative performance is compared for three different language models applied to the linguistic decoding part of a 75000-word speech recognizer. These models are the trigram model, the tri-POS model (POS stands for parts of speech), and a smoothed trigram model with tied distributions for words three or more syllables long. The full trigram model gives the best performance but is the most expensive in terms of data and storage requirements. The smoothed trigram and tri-POS models yield equivalent performance. For general text entry tasks, use of the tri-POS model is suggested since it is less sensitive to variations in the discourse domain. For applications specific to individual discourse domains, trigram models trained on data obtained from the target domain are recommended.

Keywords

natural languages; speech recognition; discourse domains; large-vocabulary speech recognizer; linguistic decoding; parts of speech; probabilistic language models; smoothed trigram model; target domain; text entry; tri-POS model; trigram model; Acoustical engineering; Councils; Databases; Decoding; Frequency; Natural languages; Performance evaluation; Speech recognition; Testing; Text recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 International Conference on

Conference_Location

New York, NY

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.1988.196632

Filename

196632