Title of article :
A context vector model for information retrieval
Author/Authors :
Holger Billhardt, Daniel Borrajo, Victor Maojo
Issue Information :
Monthly, serially numbered issue, 2002
Pages :
14
From page :
236
To page :
249
Abstract :
In the vector space model for information retrieval, term vectors are pairwise orthogonal, that is, terms are assumed to be independent. It is well known that this assumption is too restrictive. In this article, we present our work on an indexing and retrieval method that, based on the vector space model, incorporates term dependencies and thus obtains semantically richer representations of documents. First, we generate term context vectors based on the co-occurrence of terms in the same documents. These vectors are used to calculate context vectors for documents. We present different techniques for estimating the dependencies among terms. We also define term weights that can be employed in the model. Experimental results on four text collections (MED, CRANFIELD, CISI, and CACM) show that incorporating term dependencies into the retrieval process yields statistically significantly better results than the classical vector space model with IDF weights. We also show that the degree of semantic matching versus direct word matching that performs best varies across the four collections. We conclude that the model performs well for certain types of queries and, generally, for information tasks with high recall requirements. Therefore, we propose the use of the context vector model in combination with other, direct word-matching methods.
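The two steps named in the abstract (term context vectors from co-occurrence, then document context vectors built from them) can be illustrated with a minimal Python sketch. It is not the authors' formulation: the co-occurrence estimate (the fraction of documents containing a term that also contain another term), the raw term-frequency weighting, the toy corpus, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def term_context_vectors(docs, vocab):
    """Return one context vector per term.

    Assumed estimate (not taken from the article): entry k of the vector
    for term t is the fraction of documents containing t that also
    contain term k, so the entry for t itself is always 1.0.
    """
    doc_sets = [set(d) for d in docs]
    vectors = {}
    for t in vocab:
        containing = [s for s in doc_sets if t in s]
        vectors[t] = [
            sum(1 for s in containing if k in s) / len(containing)
            for k in vocab
        ]
    return vectors


def context_vector(tokens, term_vectors, dim):
    """Sum the context vectors of the known terms, weighted by raw term
    frequency (an assumed weighting), and scale to unit length."""
    counts = Counter(t for t in tokens if t in term_vectors)
    vec = [0.0] * dim
    for t, tf in counts.items():
        for k, c in enumerate(term_vectors[t]):
            vec[k] += tf * c
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


# Hypothetical toy corpus, for illustration only.
docs = [
    ["heart", "attack", "cardiac"],
    ["cardiac", "arrest", "heart"],
    ["vector", "space", "retrieval"],
]
vocab = sorted({t for d in docs for t in d})
term_vecs = term_context_vectors(docs, vocab)
doc_vecs = [context_vector(d, term_vecs, len(vocab)) for d in docs]

query_vec = context_vector(["attack"], term_vecs, len(vocab))
scores = [cosine(query_vec, dv) for dv in doc_vecs]
```

In this toy corpus the second document does not contain the query word "attack", yet it receives a nonzero score because "attack" co-occurs with "heart" and "cardiac". Under pairwise-orthogonal term vectors its score would be zero. This is the kind of semantic matching, as opposed to direct word matching, that the abstract contrasts.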
Journal title :
Journal of the American Society for Information Science and Technology
Serial Year :
2002
Record number :
993205