DocumentCode :
2288498
Title :
Offensive and defensive strategy of web crawler
Author :
Jiang Yuanshu ; Tang Wenzhong ; Guo Liyong
Author_Institution :
Key Lab. of Beijing Network Technol., Beihang Univ., Beijing, China
fYear :
2012
fDate :
6-8 July 2012
Firstpage :
355
Lastpage :
358
Abstract :
The crawling strategy of a web crawler affects not only the quality of a search engine but also the working status of the web servers it visits. Many web servers restrict access by unknown crawlers or by crawlers that visit too frequently. This paper analyzes these restrictions, proposes a proxy-based strategy that logs in by automatically recognizing verification codes, and gives some guidance on the design of web crawlers.
Keywords :
Internet; online front-ends; query processing; search engines; Web crawler; Web server; crawling strategy; defensive strategy; offensive strategy; search engine; verification code; Browsers; Crawlers; IP networks; Search engines; Time frequency analysis; Web servers; proxy server; recognition of verification code; web crawler;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Control and Automation (WCICA), 2012 10th World Congress on
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-1397-1
Type :
conf
DOI :
10.1109/WCICA.2012.6357898
Filename :
6357898