Title :
Objective evaluation of Spider Detection Techniques
Author :
Chunlong, Fan ; Zhouhua, Yu ; Lei, Xu
Author_Institution :
Sch. of Comput., Univ. of Shenyang Aerosp., Shenyang, China
Abstract :
A spider is a program that harvests Internet resources. Spider Detection Techniques (SDT) are used to regulate and monitor the behavior of spiders visiting a website. In this paper, an Evaluation Method based on the Trap technique (EMT) is proposed to calculate the recall rate and precision rate of SDT. Because it does not rely on manual analysis, it is more objective and adapts better to the development of SDT. EMT is based on the statistical hypothesis that the distribution of users captured by traps follows a binomial distribution. Experiments with EMT support three conclusions: (1) EMT's results are consistent with those of manual analysis; (2) EMT is little affected by the time span of the analysis; (3) EMT is little affected by the trap layout rate when it varies within ±10%.
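The abstract does not give EMT's formulas, so the following is only an illustrative sketch, not the authors' method. It shows the standard definitions of precision and recall for a detector, plus one way a trap sample could be used to estimate recall under a binomial assumption. The function names, the use of client identifiers as set members, and the idea that every trapped client is a spider are all assumptions made for this example.

```python
import math


def precision_recall(flagged, actual_spiders):
    """Standard precision and recall of a detector against known ground truth."""
    flagged, actual = set(flagged), set(actual_spiders)
    true_positives = len(flagged & actual)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


def trap_recall_estimate(flagged, trapped):
    """Estimate recall from trapped clients only (hypothetical helper).

    Assumption: each trapped client is a spider, and whether the detector
    flags it is an independent Bernoulli trial, so the number of flagged
    trapped clients is binomially distributed.
    Returns the estimated recall and its binomial standard error.
    """
    flagged, trapped = set(flagged), set(trapped)
    n = len(trapped)
    if n == 0:
        raise ValueError("no clients were captured by the traps")
    k = len(flagged & trapped)
    p_hat = k / n
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, std_err
```

With a larger trap sample the standard error shrinks, which is one plausible reason an estimate of this kind could be fairly insensitive to small changes in how many traps are laid out.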
Keywords :
Crawlers; Data privacy; Humans; Information retrieval; Internet; Monitoring; Robots; Search engines; Tin; Uniform resource locators; binomial distribution; evaluation; layout rate; spider detection; trap
Conference_Title :
Wireless Communications, Networking and Information Security (WCNIS), 2010 IEEE International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
978-1-4244-5850-9
DOI :
10.1109/WCINS.2010.5541838