DocumentCode :
3597041
Title :
Adaptive focused crawler based on tunneling and link analysis
Author :
Zhang, Xiaoming ; Li, Zhoujun ; Hu, Chaojian
Author_Institution :
Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing
Volume :
3
fYear :
2009
Firstpage :
2225
Lastpage :
2230
Abstract :
Focused crawlers have become a common way to find needed information. The main characteristic of a focused web crawler is that it selects and retrieves only relevant web pages in each crawl. In this paper, we propose a learnable algorithm that combines link analysis with web content to retrieve specific web documents and that learns to predict the next URL to visit. The algorithm also uses adaptive tunneling to overcome some of the limitations of conventional focused crawlers. We apply three metrics to compare its efficiency with that of other well-known web crawling techniques.
Keywords :
Internet; information retrieval; information retrieval systems; Web content; Web document retrieval; adaptive focused Web crawler; learnable algorithm; link analysis; tunneling analysis; Algorithm design and analysis; Chaos; Computer science; Content based retrieval; Crawlers; Information analysis; Testing; Tunneling; Uniform resource locators; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Communication Technology, 2009. ICACT 2009. 11th International Conference on
ISSN :
1738-9445
Print_ISBN :
978-89-5519-138-7
Electronic_ISBN :
1738-9445
Type :
conf
Filename :
4809522