Title :
Detection of Internet robots using a Bayesian approach
Author :
Suchacka, Grazyna ; Sobkow, Mariusz
Author_Institution :
Fac. of Math., Phys. & Comput. Sci., Opole Univ., Opole, Poland
Abstract :
A large part of Web traffic on e-commerce sites is generated not by human users but by Internet robots: search engine crawlers, shopping bots, hacking bots, etc. In practice, not all robots, especially malicious ones, disclose their identities to a Web server, so methods for their detection and identification are needed. This paper proposes the application of a Bayesian approach to robot detection based on characteristics of user sessions. The method is applied to the Web traffic from a real e-commerce site. Results show that the classification model based on cluster analysis with Ward's method and the weighted Euclidean metric is very effective in robot detection, achieving accuracy above 90%.
Keywords :
Bayes methods; Internet; Web sites; electronic commerce; invasive software; pattern classification; pattern clustering; telecommunication traffic; Bayesian approach; Internet robots detection; Internet robots identification; Ward method; Web server; Web traffic; classification model; cluster analysis; e-commerce sites; hacking bots; malicious robots; search engine crawlers; shopping bots; user sessions characteristics; weighted Euclidean metric; Correlation; Euclidean distance; Robots; Testing; Bayesian statistics; Internet robot; Matlab; Web bot; Web mining; Web robot detection; correlation analysis; data mining; e-commerce; log file analysis;
Conference_Title :
Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on
Conference_Location :
Gdynia
Print_ISBN :
978-1-4799-8320-9
DOI :
10.1109/CYBConf.2015.7175961