Title :
Cross Domain Random Walk for Query Intent Pattern Mining from Search Engine Log
Author :
Gu, Siyu ; Yan, Jun ; Ji, Lei ; Yan, Shuicheng ; Huang, Junshi ; Liu, Ning ; Chen, Ying ; Chen, Zheng
Author_Institution :
Dept. of CS, Beijing Inst. of Technol., Beijing, China
Abstract :
Understanding users' search intents from their short, condensed queries has attracted much attention in both academia and industry. Search intents are generally assumed to be associated with query patterns such as "MobileName price", where "MobileName" can be any named entity denoting a mobile phone model and the pattern indicates that the user intends to buy a mobile phone. However, discovering query intent patterns for general search is challenging, mainly because it is difficult to collect sufficient training data for learning query patterns across a large number of searchable domains. In this work, we propose the Cross Domain Random Walk (CDRW) algorithm, a semi-supervised method for discovering query intent patterns across different domains from search engine click-through log data. Starting with manually tagged seed queries in one or more independent domains, CDRW uses query patterns as a bridge and propagates transition probability across domains to collect query intent patterns in different domains, based on the assumption that "users who have similar intent in different but similar domains will have high probability to share similar query patterns across domains". Unlike classical random walk algorithms, CDRW walks across different domains to disseminate shared knowledge in a transfer learning manner. Extensive experimental results on real log data from a commercial search engine validate the effectiveness and efficiency of the proposed algorithm.
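The idea in the abstract, that a shared query pattern can carry intent evidence from a tagged domain to an untagged one, can be illustrated with a small sketch. This is not the CDRW algorithm itself, whose details the abstract does not give; it is a plain random walk with restart on a toy query-pattern graph. The function names `to_pattern` and `propagate`, the `restart` parameter, and the example queries and edge weights are all assumptions made for illustration.

```python
from collections import defaultdict


def to_pattern(query, entities, slot):
    """Replace a known entity in the query with a slot name.

    Hypothetical helper: "iphone 4 price" with slot "MobileName"
    becomes "MobileName price". Returns None if no entity matches.
    """
    # Try longer entities first so "iphone 4" wins over "iphone".
    for entity in sorted(entities, key=len, reverse=True):
        if entity in query:
            return query.replace(entity, slot)
    return None


def propagate(edges, seeds, restart=0.15, iterations=50):
    """Random walk with restart over an undirected weighted graph.

    edges: dict mapping (node_u, node_v) -> positive weight
    seeds: dict mapping tagged seed nodes -> initial score
    Returns a dict of node -> score; scores sum to 1.
    """
    neighbors = defaultdict(dict)
    for (u, v), weight in edges.items():
        neighbors[u][v] = weight
        neighbors[v][u] = weight

    # Restart distribution: seed mass normalized over nodes in the graph.
    total = sum(seeds.get(node, 0.0) for node in neighbors)
    start = {node: seeds.get(node, 0.0) / total for node in neighbors}

    scores = dict(start)
    for _ in range(iterations):
        updated = {node: restart * start[node] for node in neighbors}
        for u, nbrs in neighbors.items():
            out_weight = sum(nbrs.values())
            for v, weight in nbrs.items():
                updated[v] += (1 - restart) * scores[u] * weight / out_weight
        scores = updated
    return scores


# Toy graph: queries from two domains linked through slot-free patterns.
# Only the mobile-domain query is tagged as a seed (assumed "buy" intent).
edges = {
    ("mobile: iphone 4 price", "pattern: [ENTITY] price"): 1.0,
    ("camera: canon 5d price", "pattern: [ENTITY] price"): 1.0,
    ("camera: canon 5d manual", "pattern: [ENTITY] manual"): 1.0,
}
seeds = {"mobile: iphone 4 price": 1.0}
scores = propagate(edges, seeds)
```

In this toy graph the camera query that shares the "price" pattern with the seed receives probability mass, while the camera query attached only to the "manual" pattern receives none, which mirrors the bridging role the abstract assigns to query patterns.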
Keywords :
data mining; learning (artificial intelligence); probability; query processing; random processes; search engines; MobileName price; cross domain random walk algorithm; mobile phone model; query intent pattern mining; search engine click-through log data; semisupervised learning; transfer learning; transition probability; Algorithm design and analysis; Bridges; Data mining; Manuals; Mobile handsets; Search engines; Training data; query intent pattern; random walk; semi-supervised learning; transfer learning;
Conference_Title :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4577-2075-8
DOI :
10.1109/ICDM.2011.44