Title of article :
A proposed query-sensitive similarity measure for information retrieval
Author/Authors :
Zolghadri-Jahromi, M. Shiraz University - Dept. of Computer Science and Engineering, Shiraz, Iran , Valizadeh, M.R. Islamic Azad University of Ilam, Iran
From page :
171
To page :
180
Abstract :
Document clustering has been widely used in information retrieval systems to improve both the efficiency and the effectiveness of ranked-output systems, based on the clustering hypothesis. According to this hypothesis, documents relevant to a query tend to be highly similar in the context defined by the query. A pair of documents therefore has an overall similarity (ignoring the query) and a specific similarity (the similarity of the pair given a query). A Query-Sensitive Similarity Measure (QSSM) is a mechanism for measuring the similarity of two documents given a query. In this paper, we first identify the sources of information that may be used for this purpose. Second, we propose a QSSM based on these information sources. Finally, we propose a parametric QSSM that uses both the product and the weighted sum to fuse the information from the identified sources. A genetic algorithm is used to learn the optimal parameter values of this measure for a specific collection, and the leave-one-out method is used to evaluate the proposed learning scheme. Our motivation is to see whether the learning scheme can perform significantly better than the measure proposed in the second step. Using several document collections, we evaluate the performance of each measure and compare the results with other QSSMs proposed in past research.
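The idea of fusing an overall and a specific similarity through both a product and a weighted sum can be illustrated with a minimal sketch. It is not the measure defined in the paper: the abstract does not give the formula, so the functional form, the use of cosine similarity over shared terms, and the names `qssm`, `lam`, and `w` are all assumptions made here for illustration. In the paper the parameters are learned by a genetic algorithm; here they are fixed by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def common_terms(u, v):
    """Terms shared by both documents, weighted by the smaller weight."""
    return {term: min(weight, v[term]) for term, weight in u.items() if term in v}

def qssm(d1, d2, query, lam=0.5, w=0.5):
    """Hypothetical query-sensitive similarity (assumed form, not the paper's).

    overall  : similarity of the two documents, ignoring the query
    specific : similarity of the documents' shared terms to the query
    lam      : balance between the product and the weighted sum (assumed)
    w        : weight of `overall` inside the weighted sum (assumed)
    """
    overall = cosine(d1, d2)
    specific = cosine(common_terms(d1, d2), query)
    product = overall * specific
    weighted_sum = w * overall + (1.0 - w) * specific
    return lam * product + (1.0 - lam) * weighted_sum

# Two document pairs that are equally similar overall; only the first
# pair shares a term with the query.
query = {"query": 1.0}
pair_a = ({"query": 1.0, "cluster": 1.0}, {"query": 1.0, "cluster": 1.0})
pair_b = ({"genetic": 1.0, "cluster": 1.0}, {"genetic": 1.0, "cluster": 1.0})
score_a = qssm(*pair_a, query)
score_b = qssm(*pair_b, query)
```

Under these assumptions, `score_a` exceeds `score_b` even though both pairs have the same overall similarity, which is the behaviour a query-sensitive measure is meant to capture.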
Keywords :
Query sensitive similarity measure , document clustering , genetic algorithm
Journal title :
Iranian Journal of Science and Technology: Transactions of Electrical Engineering
Record number :
2596194