Title :
An incremental update strategy in Deep Web
Author :
Li, Hui ; Guo, Mei ; Cai, Liang ; Yang, Yanwu
Author_Institution :
Coll. of Inf., BUCT, Beijing, China
Abstract :
An effective incremental web crawler keeps a local repository of web pages up to date. In this paper, we introduce an approach to updating pages in the Deep Web. Unlike traditional studies, which mainly concentrate on “important pages” or “refresh”, we classify pages into categories, each with its own way of calculating a priority measure, obtain the coefficient ratio between the derived categories through experimental statistics, and automatically adjust parameters to achieve the incremental update.
Keywords :
Internet; Web sites; classification; search engines; statistical analysis; Web pages; deep Web; experimental statistics; incremental Web crawler; incremental update strategy; pages classification; Books; Crawlers; Heuristic algorithms; Measurement; Navigation; crawler; deep web; incremental update; url categories
Conference_Title :
Natural Computation (ICNC), 2010 Sixth International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5958-2
DOI :
10.1109/ICNC.2010.5583330