• DocumentCode
    1925855
  • Title

    Web robot detection techniques based on statistics of their requested URL resources

  • Author

    Guo, Weigang ; Ju, Shiguang ; Gu, Yi

  • Author_Institution
    Inf. Center, Foshan Univ., Guangdong, China
  • Volume
    1
  • fYear
    2005
  • fDate
    24-26 May 2005
  • Firstpage
    302
  • Abstract
    With the widespread use of search engines, the impact of Web robots on Web sites should not be ignored. After analyzing the navigational patterns of Web robots in Web logs, two new detection algorithms are proposed. The first is based on classification and statistics of requested URL resources: it classifies the URL resources into eight types and counts, for each type, the number of client sessions and the number of visit records. The second is based on a Web page member list: it constructs one member list for every Web page and one show table for every visitor. Experiments show that the two algorithms can detect unknown robots as well as unfriendly robots that do not obey the Robots Exclusion Standard.
  • Keywords
    Internet; Web sites; search engines; statistical analysis; URL resource; Web logs; Web page; Web robot detection; Algorithm design and analysis; Humans; Pattern analysis; Robotics and automation; Robots; Statistics; Uniform resource locators
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Supported Cooperative Work in Design, 2005. Proceedings of the Ninth International Conference on
  • Print_ISBN
    1-84600-002-5
  • Type

    conf

  • DOI
    10.1109/CSCWD.2005.194187
  • Filename
    1504093
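
The first algorithm summarized in the abstract (classifying requested URL resources into types, then counting sessions and visit records per type) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the eight resource types, so the categories in `TYPE_BY_EXT`, the `(client, session_id, url)` record layout, and the decision rule in `looks_like_robot` are all hypothetical.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse
import posixpath

# Hypothetical mapping from file extension to resource type. The abstract
# mentions eight types without listing them; these eight are assumed:
# page, script, image, style, document, media, robots_txt, other.
TYPE_BY_EXT = {
    "": "page", ".html": "page", ".htm": "page",
    ".php": "script", ".asp": "script", ".cgi": "script",
    ".gif": "image", ".jpg": "image", ".png": "image",
    ".css": "style", ".js": "style",
    ".pdf": "document", ".doc": "document",
    ".mp3": "media", ".avi": "media",
}


def resource_type(url):
    """Classify a requested URL into one of the assumed resource types."""
    path = urlparse(url).path
    if posixpath.basename(path) == "robots.txt":
        return "robots_txt"
    ext = posixpath.splitext(path)[1].lower()
    return TYPE_BY_EXT.get(ext, "other")


def type_statistics(records):
    """Count visit records and distinct sessions per client and type.

    records: iterable of (client, session_id, url) tuples (assumed layout).
    Returns {client: {type: (n_records, n_sessions)}}.
    """
    hits = defaultdict(Counter)
    sessions = defaultdict(lambda: defaultdict(set))
    for client, session_id, url in records:
        rtype = resource_type(url)
        hits[client][rtype] += 1
        sessions[client][rtype].add(session_id)
    return {
        client: {t: (hits[client][t], len(sessions[client][t]))
                 for t in hits[client]}
        for client in hits
    }


def looks_like_robot(stats, min_pages=3):
    """Assumed heuristic, not taken from the paper: flag a client that
    fetched robots.txt, or that requested several pages but never any
    embedded images or style resources."""
    if "robots_txt" in stats:
        return True
    pages = stats.get("page", (0, 0))[0]
    embedded = stats.get("image", (0, 0))[0] + stats.get("style", (0, 0))[0]
    return pages >= min_pages and embedded == 0
```

A caller would first split the Web log into sessions (for example by client address and an inactivity timeout, which is also an assumption here), pass the resulting tuples to `type_statistics`, and then apply `looks_like_robot` to each client's entry.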