DocumentCode :
1066593
Title :
Minimum Rank Error Language Modeling
Author :
Chien, Jen-Tzung ; Wu, Meng-Sung
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan
Volume :
17
Issue :
2
fYear :
2009
Firstpage :
267
Lastpage :
276
Abstract :
Statistical language modeling has been successfully developed for speech recognition and information retrieval. Minimum classification error (MCE) training was previously used to enhance speech recognition performance by minimizing the word error rate. This paper presents a new minimum rank error (MRE) algorithm for n-gram language model training. Rather than targeting speech recognition, the proposed language models are estimated for information retrieval by considering the metric of average precision. Maximizing average precision is closely linked to minimizing the rank error, that is, to optimizing the order of the ranked documents. Accordingly, this paper calculates the rank error loss function from the misordered pairs of relevant and irrelevant documents in the rank list. The Bayes risk due to the expected rank loss is minimized to develop the Bayesian retrieval rule for ad-hoc information retrieval. Consequently, discriminative training of the language model is performed by integrating discrimination information from individual relevant documents relative to their corresponding irrelevant documents. Experimental results on TREC collections indicate that the proposed MRE language model moves relevant documents up the ranked list and irrelevant documents down. The MRE method achieves significantly higher average precision for test queries than the maximum likelihood and MCE retrieval models.
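The link the abstract draws between average precision and misordered relevant/irrelevant pairs can be illustrated with a small sketch. This is not the paper's MRE loss function, which is defined over language model scores and minimized as a Bayes risk; it is only a plain count over a ranked list of relevance labels. The function names `misordered_pairs` and `average_precision` are hypothetical, and the sketch assumes every relevant document appears somewhere in the ranked list.

```python
from typing import List


def misordered_pairs(ranked_relevance: List[bool]) -> int:
    """Count (relevant, irrelevant) pairs in which the irrelevant
    document is ranked above the relevant one."""
    errors = 0
    irrelevant_seen = 0
    for is_relevant in ranked_relevance:
        if is_relevant:
            # Every irrelevant document already seen outranks this one.
            errors += irrelevant_seen
        else:
            irrelevant_seen += 1
    return errors


def average_precision(ranked_relevance: List[bool]) -> float:
    """Mean of precision@k taken at the rank k of each relevant document.
    Assumes all relevant documents are present in the list."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Three rankings of the same two relevant and two irrelevant documents.
perfect = [True, True, False, False]
one_swap = [True, False, True, False]
worst = [False, False, True, True]

for ranking in (perfect, one_swap, worst):
    print(misordered_pairs(ranking), round(average_precision(ranking), 4))
```

In this toy case, each additional misordered pair lowers the average precision, which is the intuition behind training against a rank error loss rather than a classification error.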
Keywords :
Bayes methods; information retrieval; speech recognition; statistical analysis; Bayes risk; Bayesian retrieval rule; average precision; information retrieval; minimum classification error training; rank error loss function; speech recognition; statistical language modeling; Bayesian methods; Degradation; Error analysis; Information retrieval; Information systems; Maximum likelihood estimation; Natural languages; Optical losses; Speech recognition; Testing; Average precision; discriminative training; information retrieval; language model; rank error loss function;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2008.2008366
Filename :
4749457