• DocumentCode
    2768960
  • Title
    Investigating linguistic knowledge in a maximum entropy token-based language model
  • Author
    Cui, Jia; Su, Yi; Hall, Keith; Jelinek, Frederick
  • Author_Institution
    Johns Hopkins University, Baltimore, MD
  • fYear
    2007
  • fDate
    9-13 Dec. 2007
  • Firstpage
    171
  • Lastpage
    176
  • Abstract
    We present a novel language model capable of incorporating various types of linguistic information, encoded in the form of tokens: (word, label) tuples. Using tokens as hidden states, our model is effectively a hidden Markov model (HMM) that produces word sequences with trivial output distributions. The transition probabilities, however, are computed with a maximum entropy model so that potentially overlapping features can be exploited. We investigate several types of labels with a wide range of linguistic implications. These models outperform Kneser-Ney smoothed n-gram models both in perplexity on standard datasets and in word error rate for a large-vocabulary speech recognition system. (A minimal illustrative sketch of the token-transition scheme follows this record.)
  • Keywords
    hidden Markov models; linguistics; maximum entropy methods; speech recognition; Kneser-Ney smoothed n-gram models; linguistic knowledge; maximum entropy token-based language model; speech recognition system; token encoding; context modeling; entropy; error analysis; natural languages; predictive models; speech processing; testing; vocabulary
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)
  • Conference_Location
    Kyoto, Japan
  • Print_ISBN
    978-1-4244-1746-9
  • Electronic_ISBN
    978-1-4244-1746-9
  • Type
    conf
  • DOI
    10.1109/ASRU.2007.4430104
  • Filename
    4430104
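  • Sketch
    The abstract describes an HMM whose hidden states are (word, label) tokens, with trivial (deterministic) emissions and transition probabilities given by a maximum entropy model over overlapping features. The Python sketch below is a minimal, hypothetical illustration of that scheme, not the authors' implementation: the vocabulary, label set, feature templates, and weights are invented stand-ins, and in the paper the weights would instead be estimated by maximum entropy training.

```python
import math

# Minimal sketch of a token-based maxent language model.
# Tokens are (word, label) tuples acting as hidden HMM states;
# a token (w, l) emits exactly the word w (trivial emission),
# so scoring a word sequence reduces to summing over label paths.

WORDS = ["the", "dog", "barks"]
LABELS = ["D", "N", "V"]                        # hypothetical POS-like labels
TOKENS = [(w, l) for w in WORDS for l in LABELS]
START = ("<s>", "S")                            # sentence-initial context token

def features(prev_token, token):
    """Overlapping binary features on a token bigram (templates are
    illustrative, not the paper's)."""
    pw, pl = prev_token
    w, l = token
    return [
        f"ww:{pw}->{w}",   # word bigram
        f"ll:{pl}->{l}",   # label bigram
        f"wl:{pw}->{l}",   # previous word -> current label
        f"u:{w}",          # current word unigram
    ]

# Hypothetical weights; features absent from the map score 0.
WEIGHTS = {"ll:D->N": 1.5, "ll:N->V": 1.2, "ww:the->dog": 0.8, "u:the": 0.3}

def transition_prob(prev_token, token):
    """P(token | prev_token) as a softmax over summed feature weights."""
    def score(t):
        return sum(WEIGHTS.get(f, 0.0) for f in features(prev_token, t))
    z = sum(math.exp(score(t)) for t in TOKENS)  # maxent normalizer
    return math.exp(score(token)) / z

def sentence_prob(words):
    """P(words) via the forward algorithm over hidden labels.
    Emissions are deterministic, so only transitions contribute."""
    # alpha[l] = total probability of the prefix ending with label l
    alpha = {l: transition_prob(START, (words[0], l)) for l in LABELS}
    for prev_w, w in zip(words, words[1:]):
        alpha = {l: sum(alpha[pl] * transition_prob((prev_w, pl), (w, l))
                        for pl in LABELS)
                 for l in LABELS}
    return sum(alpha.values())

if __name__ == "__main__":
    print(sentence_prob(["the", "dog", "barks"]))  # small probability
```

    Because the previous word is observed, the forward recursion only needs to sum over the previous label, keeping inference cheap; overlapping features (word bigrams, label bigrams, mixed word-label features) are exactly what the maxent parameterization accommodates and what plain count-based HMM transitions cannot.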