Title :
Efficient representation and fast look-up of Maximum Entropy language models
Author :
Cui, Jia; Chen, Stanley; Zhou, Bowen
Author_Institution :
IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Abstract :
Word class information has long proven useful in language modeling (LM). However, the improved performance of class-based LMs over word n-gram models generally comes at the cost of increased decoding complexity and model size. In this paper, we propose a modified version of the Maximum Entropy token-based language model of [1] that matches the performance of the best existing class-based models but is as fast to decode as a word n-gram model. In addition, while it is easy to statically combine word n-gram models built on different corpora into a single word n-gram model for fast decoding, it is not known how to statically combine class-based LMs effectively. A further contribution of this paper is a novel combination method that retains the gain of class-based LMs over word n-gram models. Experimental results on several spoken language translation tasks show that our model significantly outperforms word n-gram models while maintaining comparable decoding speed, with only a modest increase in model size.
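For reference, a Maximum Entropy language model is conventionally parameterized in the exponential form below; this is the standard ME formulation, not necessarily the exact token-based parameterization of [1]:

P(w \mid h) = \frac{1}{Z(h)} \exp\Big(\sum_i \lambda_i f_i(h, w)\Big), \qquad Z(h) = \sum_{w'} \exp\Big(\sum_i \lambda_i f_i(h, w')\Big)

where the f_i are feature functions of the history h and predicted word w, and the \lambda_i are trained weights. The static combination of word n-gram models mentioned in the abstract is typically linear interpolation, P(w \mid h) = \sum_k \mu_k P_k(w \mid h) with \sum_k \mu_k = 1, precomputed into a single table so decoding touches only one model. A minimal Python sketch of that precomputation, assuming a toy dict-based n-gram table and ignoring back-off structure (the function name and representation are illustrative assumptions, not the paper's implementation):

    def interpolate_ngram_models(models, weights):
        """Merge several n-gram probability tables into one static table.

        models:  list of dicts mapping (history, word) -> probability
        weights: one interpolation weight per model, summing to 1.0
        """
        assert abs(sum(weights) - 1.0) < 1e-9
        combined = {}
        for model, w in zip(models, weights):
            for ngram, prob in model.items():
                # Weighted sum of each component model's probability.
                combined[ngram] = combined.get(ngram, 0.0) + w * prob
        return combined

    # Toy usage: two bigram tables estimated on different corpora.
    corpus_a = {(("the",), "cat"): 0.2, (("the",), "dog"): 0.8}
    corpus_b = {(("the",), "cat"): 0.5, (("the",), "dog"): 0.5}
    merged = interpolate_ngram_models([corpus_a, corpus_b], [0.6, 0.4])
    print(merged)  # ≈ {(('the',), 'cat'): 0.32, (('the',), 'dog'): 0.68}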
Keywords :
decoding; entropy; speech recognition; class-based LM; decoding speed; maximum entropy token-based language model; n-gram models; word class information; Computational modeling; Data models; Decoding; History; Interpolation; Training; Training data
Conference_Titel :
2011 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Conference_Location :
Waikoloa, HI, USA
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163936