Title :
Mining Related Queries from Query Logs Based on Linear Regression
Author :
Zhai, Haijun ; Zhang, Jin ; Wang, Xiaolei ; Zhang, Gang
Author_Institution :
Dept. of Comput. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei
Abstract :
In this paper, a novel linear regression model is proposed to mine related queries from query logs. Three types of association relationships between queries are identified and leveraged in our model: query session co-occurrence, URL-clicked sharing and text similarity. Previous work directly applies only some of these relations and may therefore be strongly affected by noise in query logs, such as the sparsity of click-through data, query-session segmentation errors and noisy clicks. In this work, we use linear regression analysis to identify effective features, which allows us to deal with the noise effectively. The experiments demonstrate that the features identified with linear regression analysis are very effective. Moreover, our proposed linear regression model outperforms existing methods.
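The abstract does not give the feature definitions or the training setup, so the following is only a minimal sketch of how three association scores for a query pair might be combined by ordinary least squares. The helper names (`jaccard`, `fit_ols`, `predict`), the use of Jaccard overlap as a score, and the toy data are all assumptions made for illustration; none of them comes from the paper.

```python
from typing import List, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets (hypothetical choice of score); 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return least-squares weights [intercept, w1, ..., wk] via the normal equations."""
    x = [[1.0, *r] for r in rows]  # prepend a constant column for the intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """Relatedness score of one query pair under the fitted weights."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


# Made-up toy data: each row is (session co-occurrence, shared clicked URLs,
# text similarity) for one query pair; labels are invented relatedness values.
toy_rows = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (1.0, 1.0, 1.0),
]
toy_labels = [0.1, 0.6, 0.4, 0.3, 1.1]

weights = fit_ols(toy_rows, toy_labels)
text_score = jaccard({"linear", "regression"}, {"regression", "model"})
```

In a sketch like this, the size of each fitted weight hints at how much the corresponding relation contributes, which is one plausible reading of using regression analysis to identify effective features.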
Keywords :
data mining; query processing; regression analysis; URL-clicked sharing; click-through data; linear regression; noisy clicks; query logs; query mining; query session co-occurrence; query-session segmentation error; text similarity; Computer science; Engineering management; Information management; Information retrieval; Information technology; Linear regression; Search engines; Seminars; Technology management; Web search; linear regression; query log; query session; related query;
Conference_Title :
Future Information Technology and Management Engineering, 2008. FITME '08. International Seminar on
Conference_Location :
Leicestershire, United Kingdom
Print_ISBN :
978-0-7695-3480-0
DOI :
10.1109/FITME.2008.59