DocumentCode
3437784
Title
Objective evaluation of Spider Detection Techniques
Author
Chunlong, Fan ; Zhouhua, Yu ; Lei, Xu
Author_Institution
Sch. of Comput., Univ. of Shenyang Aerosp., Shenyang, China
fYear
2010
fDate
25-27 June 2010
Firstpage
544
Lastpage
548
Abstract
A spider is a program that harvests Internet resources. Spider Detection Techniques (SDT) are used to regulate and monitor the behavior of spiders visiting a website. This paper proposes an Evaluation Method based on the Trap technique (EMT) to calculate the recall and precision of SDT. Because it does not rely on manual analysis, EMT is more objective and adapts more readily to the development of SDT. EMT rests on the statistical hypothesis that the distribution of users captured by traps follows a binomial distribution. The experiments support three conclusions: (1) EMT's results are consistent with those of manual analysis; (2) EMT is little affected by the time span of the analysis; (3) EMT is little affected by the trap layout rate when it varies within ±10%.
Keywords
Crawlers; Data privacy; Humans; Information retrieval; Internet; Monitoring; Robots; Search engines; Tin; Uniform resource locators; binomial distribution; evaluation; layout rate; spider detection; trap;
fLanguage
English
Publisher
IEEE
Conference_Titel
Wireless Communications, Networking and Information Security (WCNIS), 2010 IEEE International Conference on
Conference_Location
Beijing, China
Print_ISBN
978-1-4244-5850-9
Type
conf
DOI
10.1109/WCINS.2010.5541838
Filename
5541838