• DocumentCode
    3437784
  • Title

    Objective evaluation of Spider Detection Techniques

  • Author

    Chunlong, Fan ; Zhouhua, Yu ; Lei, Xu

  • Author_Institution
    Sch. of Comput., Univ. of Shenyang Aerosp., Shenyang, China
  • fYear
    2010
  • fDate
    25-27 June 2010
  • Firstpage
    544
  • Lastpage
    548
  • Abstract
    A spider is a program that harvests Internet resources. Spider detection techniques (SDT) are used to regulate and monitor the behavior of spiders visiting a website. This paper proposes an evaluation method based on the trap technique (EMT) to calculate the recall rate and precision rate of an SDT. Because it does not rely on manual analysis, EMT is more objective and adapts more readily as SDT develops. EMT rests on the statistical hypothesis that the distribution of users captured by traps follows a binomial distribution. The experiments support three conclusions: (1) EMT's results are consistent with those of manual analysis; (2) EMT is little affected by the time span of the analysis; (3) EMT is little affected by the trap layout rate when that rate varies within ±10%.
  • Keywords
    Crawlers; Data privacy; Humans; Information retrieval; Internet; Monitoring; Robots; Search engines; Tin; Uniform resource locators; binomial distribution; evaluation; layout rate; spider detection; trap;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Wireless Communications, Networking and Information Security (WCNIS), 2010 IEEE International Conference on
  • Conference_Location
    Beijing, China
  • Print_ISBN
    978-1-4244-5850-9
  • Type

    conf

  • DOI
    10.1109/WCINS.2010.5541838
  • Filename
    5541838