Title of article
A novel term weighting scheme based on discrimination power obtained from past retrieval results
Author/Authors
Sa-kwang Song (author), Sung Hyon Myaeng (author)
Issue Information
Bimonthly with serial issue number, year 2012
Pages
12
From page
919
To page
930
Abstract
Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on a hypothesis that a term's role in accumulated retrieval sessions in the past affects its general importance regardless. It utilizes the availability of past retrieval results consisting of the queries that contain a particular term, the retrieved documents, and their relevance judgments. A term's evidential weight, as we propose in this paper, depends on the degree to which the mean frequency values for the relevant and non-relevant document distributions in the past differ. More precisely, it takes into account the rankings and similarity values of the relevant and non-relevant documents. Our experimental results on standard test collections show that the proposed term weighting scheme improves on conventional TF*IDF and language-model-based schemes. This indicates that evidential term weights bring in a new aspect of term importance and complement the collection statistics based on TF*IDF. We also show how the proposed term weighting scheme, based on the notion of evidential weights, is related to the well-known weighting schemes based on language modeling and probabilistic models.
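The core idea in the abstract, that a term's weight depends on how far apart its mean frequencies are in past relevant and non-relevant documents, can be illustrated with a minimal sketch. This is not the authors' formula: the function name `evidential_weight`, the input layout, and the plain difference of means are all assumptions made for illustration, and the sketch ignores the rankings and similarity values that the abstract says the actual method uses.

```python
from statistics import mean

def evidential_weight(term, past_results):
    """Illustrative only (hypothetical helper, not the paper's formula).

    past_results is a list of (term_frequencies, is_relevant) pairs, one per
    document retrieved in past sessions for queries containing `term`.
    Returns mean frequency in relevant documents minus mean frequency in
    non-relevant documents, or 0.0 if either group is empty.
    """
    relevant = [tf.get(term, 0) for tf, is_rel in past_results if is_rel]
    non_relevant = [tf.get(term, 0) for tf, is_rel in past_results if not is_rel]
    if not relevant or not non_relevant:
        return 0.0  # no evidence to separate the two distributions
    return mean(relevant) - mean(non_relevant)

# Assumed toy data: per-document term counts with relevance judgments.
past = [
    ({"retrieval": 3, "term": 1}, True),
    ({"retrieval": 1}, True),
    ({"term": 2}, False),
    ({"retrieval": 1}, False),
]
weight = evidential_weight("retrieval", past)
```

A larger positive value would suggest that, in this toy setting, the term occurred more often in documents judged relevant, which is the kind of discrimination power the abstract describes.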
Keywords
Term weighting , Evidential weight , Discrimination power , LANGUAGE MODEL , Probabilistic model
Journal title
Information Processing and Management
Serial Year
2012
Record number
1229289