Title :
Analyzing string format-based classifiers for botnet detection: GP and SVM
Author :
Haddadi, Fariba ; Zincir-Heywood, A. Nur
Author_Institution :
Comput. Sci., Dalhousie Univ., Halifax, NS, Canada
Abstract :
The domain name system (DNS) is an essential component of the Internet. Because all legitimate users and applications are expected to use it, it is generally subject to fewer inspections, restrictions and filters. Botnets rely on this open component to carry out their malicious operations. To avoid a single point of failure and evade static blacklists and firewalls, they employ DNS-based methods that frequently and automatically generate new domain names. Stateful-SBB, a form of genetic programming (GP), was previously designed and developed by the authors to detect these automatically generated domain names using minimal a priori information, and was shown to be efficient. In this paper, we compare Stateful-SBB against the String Subsequence Kernel (SSK) and SSK with Lambda Pruning (SSK-LP), which are based on support vector machines (SVM) and also take string-format inputs. Analyzing the domain names that each classifier chooses as part of its solution during classification, we find that 50% and 63% of Stateful-SBB's frequently selected points on the Pareto front are also used by SSK and SSK-LP, respectively. By analyzing these common domain names, we identify some characteristics of botnet domain names. Moreover, we introduce a pruned version of Stateful-SBB that reduces solution complexity by 83% while maintaining the same high accuracy.
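As an illustration of the string-format input that SSK works on, the following is a minimal sketch of the standard string subsequence kernel definition applied to two strings. It is a brute-force enumeration intended only for short strings such as domain labels; it is not the authors' implementation, it does not include the lambda pruning of SSK-LP, and the function names, the subsequence length `n`, the decay factor `lam`, and the example strings are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def ssk_features(s, n, lam):
    """Map string s to {subsequence u: sum of lam**span over occurrences of u}.

    An occurrence is any increasing index tuple of length n; its span is
    last_index - first_index + 1, so gappy matches are down-weighted.
    """
    feats = defaultdict(float)
    for idx in combinations(range(len(s)), n):
        u = "".join(s[i] for i in idx)
        span = idx[-1] - idx[0] + 1
        feats[u] += lam ** span
    return feats


def ssk(s, t, n=2, lam=0.5):
    """Unnormalized kernel: dot product of the two feature maps."""
    fs = ssk_features(s, n, lam)
    ft = ssk_features(t, n, lam)
    return sum(v * ft[u] for u, v in fs.items() if u in ft)


def ssk_normalized(s, t, n=2, lam=0.5):
    """Kernel scaled into [0, 1] so that strings of different length compare fairly."""
    denom = sqrt(ssk(s, s, n, lam) * ssk(t, t, n, lam))
    return ssk(s, t, n, lam) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical labels: one dictionary-like, one random-looking.
    print(ssk_normalized("dalhousie", "dalhouse"))
    print(ssk_normalized("dalhousie", "xk3qz9vbt"))
```

A kernel of this form can be supplied to an SVM as a precomputed Gram matrix over the training domain names; the brute-force enumeration grows combinatorially with string length and `n`, so a dynamic-programming formulation would be preferable for longer inputs.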
Keywords :
Internet; data analysis; genetic algorithms; pattern classification; security of data; support vector machines; DNS-based method; GP; SSK with lambda pruning; SVM; Stateful-SBB; botnet detection; classification process; classifier analysis; domain name system; genetic programming; string format input; string format-based classifier; string subsequence kernel; Computers; Feature extraction; Kernel; Servers; Training; botnet domain name detection; evolutionary computation
Conference_Title :
Evolutionary Computation (CEC), 2013 IEEE Congress on
Conference_Location :
Cancun
Print_ISBN :
978-1-4799-0453-2
Electronic_ISBN :
978-1-4799-0452-5
DOI :
10.1109/CEC.2013.6557886