Title of article :
Information Retrieval as Statistical Translation
Author/Authors :
Berger, Adam; Lafferty, John
Issue Information :
Serial issue, year 1999
From page :
222
Abstract :
We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a statistical model of how a user might distill or "translate" a given document into a query. To assess the relevance of a document to a user's query, we estimate the probability that the query would have been generated as a translation of the document, and factor in the user's general preferences in the form of a prior distribution over documents. We propose a simple, well motivated model of the document-to-query translation process, and describe an algorithm for learning the parameters of this model in an unsupervised manner from a collection of documents. As we show, one can view this approach as a generalization and justification of the "language modeling" strategy recently proposed by Ponte and Croft. In a series of experiments on TREC data, a simple translation-based retrieval system performs well in comparison to conventional retrieval techniques. This prototype system only begins to tap the full potential of translation-based retrieval.
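The scoring idea in the abstract, ranking documents by the probability that the query was generated as a translation of the document, combined with a prior over documents, can be illustrated with a minimal Python sketch. The abstract does not give the model's form, so everything here is an assumption for illustration: the mixture form (each query word is produced by some document word w with probability t(q | w), weighted by the relative frequency of w in the document), the function names `query_log_likelihood` and `rank`, the smoothing constant `eps`, and the hand-made toy translation table. The unsupervised learning of the translation probabilities described in the abstract is not shown.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, t, eps=1e-9):
    """Log-probability that `query` is a 'translation' of `doc`.

    Assumed form (not taken from the abstract):
        p(q | d) = prod_j  sum_w  t(q_j | w) * l(w | d)
    where l(w | d) is the relative frequency of w in the document and
    t[w][q] is a hypothetical word-to-word translation table.
    """
    if not doc:
        return float("-inf")
    counts = Counter(doc)
    n = len(doc)
    logp = 0.0
    for q in query:
        # Sum over document words of t(q | w) * l(w | d).
        p = sum(t.get(w, {}).get(q, 0.0) * c / n for w, c in counts.items())
        # eps is an ad hoc floor so unseen words do not produce log(0).
        logp += math.log(p + eps)
    return logp


def rank(query, docs, t, prior=None):
    """Order document names by log p(q | d) + log p(d), best first."""
    scores = {}
    for name, words in docs.items():
        log_prior = math.log(prior[name]) if prior else 0.0
        scores[name] = query_log_likelihood(query, words, t) + log_prior
    return sorted(scores, key=scores.get, reverse=True)


# Toy translation table, invented for this example: t[w][q] = t(q | w).
t = {
    "car": {"car": 0.7, "automobile": 0.3},
    "engine": {"engine": 0.8, "motor": 0.2},
    "recipe": {"recipe": 0.9, "cooking": 0.1},
}
docs = {
    "d1": ["car", "engine", "car"],
    "d2": ["recipe", "recipe"],
}

ranking = rank(["automobile", "motor"], docs, t)
print(ranking)
```

In this toy setup the query shares no literal word with either document, yet `d1` ranks first because its words can "translate" into the query words. That is the kind of behavior a translation-based model is meant to capture, whereas exact term matching would score both documents equally.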
Keywords :
term co-occurrence, multidocument summary, concept hierarchy, subsumption
Journal title :
SIGIR FORUM
Serial Year :
1999
Record number :
16686