DocumentCode :
2107050
Title :
Topic-Specific Crawling on the Web with Concept Context Graph Based on FCA
Author :
Peng, Qiangqing ; Du, Yajun ; Hai, Yufeng ; Chen, Shaoming ; Gao, Zhaoqiong
Author_Institution :
Sch. of Math. & Comput. Eng., Xihua Univ., Chengdu, China
fYear :
2009
fDate :
20-22 Sept. 2009
Firstpage :
1
Lastpage :
4
Abstract :
Topic-specific crawling does not crawl every Web page; it crawls only the Web pages related to users' interests. Web pages that are highly relevant to the users' interests should be crawled first. The major problem in focused crawling is how to assign proper credit to the unvisited pages that the crawler will visit. In this paper, we propose an effective approach that uses a concept context graph based on formal concept analysis to solve this problem. We build a concept lattice from the visited pages, and then use a term-combination method to construct our concept context graph from the upper concept lattice. Our crawler can measure a page's expected relevancy to a given topic and determine the order in which pages should be visited. An experiment illustrates that the new method is an effective mechanism that achieves considerable results.
Keywords :
Web sites; search engines; FCA; Web page; concept context graph; formal concept analysis; topic-specific crawling; Couplings; Crawlers; Indexing; Information analysis; Large scale integration; Large-scale systems; Lattices; Mathematics; Search engines; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Management and Service Science, 2009. MASS '09. International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-4638-4
Electronic_ISBN :
978-1-4244-4639-1
Type :
conf
DOI :
10.1109/ICMSS.2009.5302301
Filename :
5302301