DocumentCode
1925855
Title
Web robot detection techniques based on statistics of their requested URL resources
Author
Guo, Weigang ; Ju, Shiguang ; Gu, Yi
Author_Institution
Inf. Center, Foshan Univ., Guangdong, China
Volume
1
fYear
2005
fDate
24-26 May 2005
Firstpage
302
Abstract
With the widespread use of search engines, the impact of Web robots on Web sites should not be ignored. Based on an analysis of the navigational patterns of Web robots in Web logs, two new detection algorithms are proposed. The first is based on classification and statistics of requested URL resources: it classifies URL resources into eight types and counts the number of client sessions and the number of visit records of the same type. The second is based on Web page member lists: it constructs one member list for every Web page and one show table for every visitor. Experiments show that the two new algorithms can detect unknown robots and unfriendly robots that do not obey the Robots Exclusion Standard.
Keywords
Internet; Web sites; search engines; statistical analysis; URL resource; Web logs; Web page; Web robot detection; search engine; Algorithm design and analysis; Humans; Pattern analysis; Robotics and automation; Robots; Statistics; Uniform resource locators; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Supported Cooperative Work in Design, 2005. Proceedings of the Ninth International Conference on
Print_ISBN
1-84600-002-5
Type
conf
DOI
10.1109/CSCWD.2005.194187
Filename
1504093