DocumentCode :
2260437
Title :
Web mining based on VIPS in intention-based information retrieval
Author :
Zhang, Qiang ; Jiang, Xiaoxiao ; Sun, Jiashen
Author_Institution :
Beijing Univ. of Posts & Telecommun., Beijing, China
fYear :
2009
fDate :
24-27 Sept. 2009
Firstpage :
1
Lastpage :
5
Abstract :
This paper introduces a Web mining method based on VIPS (Vision-based Page Segmentation) that targets retrieval based on user intent. It first gathers information from the Web through large search engines such as Baidu, and then clusters the Web pages based on intention-related features of the Web text. The main algorithm is described in detail, and experiments are designed to retrieve results for Chinese queries from the Baidu and Ask search engines. The results show that the VIPS-based method achieves a significant improvement over some previous work.
Keywords :
Internet; data mining; information retrieval; pattern clustering; search engines; text analysis; visual perception; Baidu-Ask search engine; Web page clustering; Web text mining; intention-based information retrieval; vision-based page segmentation; Clustering algorithms; Data mining; HTML; Information retrieval; Search engines; Sun; Tree data structures; Uniform resource locators; Web mining; Web pages; HTML structure; VIPS; information retrieval; web mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
Type :
conf
DOI :
10.1109/NLPKE.2009.5313791
Filename :
5313791