DocumentCode :
2193320
Title :
Co-occurrence based predictors for estimating query difficulty
Author :
Imran, Hazra ; Sharan, Aditi
Author_Institution :
Sch. of Comput. & Syst. Sci., Jawaharlal Nehru Univ., New Delhi, India
fYear :
2010
fDate :
13-13 Dec. 2010
Firstpage :
867
Lastpage :
874
Abstract :
Query difficulty prediction aims to identify, in advance, how reliably an information retrieval system will perform when faced with a particular user request. Predicting the difficulty level of a query is an interesting and important issue in Information Retrieval (IR) and remains an open research problem. To appreciate the importance of query difficulty prediction, we present an example. Information Retrieval is the science of searching for relevant documents based on a user's need and a way towards discovering knowledge from text data. A user's needs are often expressed as a query. It has been observed that a word mismatch problem arises when matching a user's query to the documents, because users and document authors do not use the same vocabulary. Query expansion/reformulation is a method to overcome such mismatch in terminology. Query expansion (QE) has become a well-known technique that has been shown to improve average retrieval performance. However, despite extensive research, QE does not provide consistent gains over different query sets and collections. Therefore, this technique has not been used in many operational systems, as it may degrade the performance of individual queries. A thorough investigation into the robustness of query expansion is required to ensure its reliability for individual queries. It is well known in the Information Retrieval community that methods such as query expansion can help "easy" queries but are detrimental to "hard" queries. If the performance of queries can be predicted before retrieval, then specific measures can be taken to improve the overall performance of the system. In this paper, we thoroughly investigate various query difficulty predictors and suggest two new query predictors based on co-occurrence of query terms. To evaluate the predictors, we have experimented on standard TREC collections. Our work is significant as it is a step towards judging the reliability and robustness of query processing operations such as query expansion.
Keywords :
data mining; information retrieval systems; query formulation; query processing; relevance feedback; reliability; text analysis; word processing; co-occurrence based predictor; information retrieval system reliability; knowledge discovery; query difficulty estimation; query expansion; query reformulation; relevant document searching; standard TREC collection; text data; user query matching; word mismatch problem; Pre-retrieval query predictors; information retrieval; query difficulty;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops (ICDMW), 2010 IEEE International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-9244-2
Electronic_ISBN :
978-0-7695-4257-7
Type :
conf
DOI :
10.1109/ICDMW.2010.81
Filename :
5693387