DocumentCode
3016882
Title
Generic feature selection measure for botnet malware detection
Author
Berg, P.E. ; Franke, Katrin ; Hai Thanh Nguyen
Author_Institution
Dept. of Comput. Sci. & Media Technol., Gjovik Univ. Coll., Gjövik, Norway
fYear
2012
fDate
27-29 Nov. 2012
Firstpage
711
Lastpage
717
Abstract
Feature selection for botnet malware detection is an important task. In this paper, we study the recently proposed Generic-Feature-Selection (GeFS) measure [18]. Since there is no benchmark dataset of botnet malware, we conduct experiments on a dataset generated using publicly available tools. We use static and dynamic approaches [24], [29], [12] to extract features from the generated dataset and produce two separate feature sets. We analyze the statistical properties of these feature sets to gain insight into their nature and quality. Subsequently, we determine appropriate instances of the GeFS measure for feature selection. We compare the GeFS measure experimentally with two other feature selection methods for botnet malware detection: the genetic-algorithm-CFS and the best-first-CFS algorithms. We use five different classifiers to test the detection rates and false positive rates. The experiments show that we can remove 99.9% of irrelevant and redundant features from the datasets while maintaining or even improving classification performance. Moreover, the GeFS measure outperforms the genetic-algorithm-CFS and the best-first-CFS methods by removing many more redundant features.
Keywords
feature extraction; invasive software; linear programming; statistical analysis; tree searching; GeFS measure; best-first-CFS algorithm; botnet malware detection; branch-and-bound; dynamic approaches; feature sets; generic feature selection measure; genetic-algorithm-CFS algorithm; mixed 0-1 linear programming; static approaches; statistical properties; Correlation; Data mining; Intrusion detection; Libraries; Malware; Mutual information; botnets; feature selection; machine learning; malware analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems Design and Applications (ISDA), 2012 12th International Conference on
Conference_Location
Kochi
ISSN
2164-7143
Print_ISBN
978-1-4673-5117-1
Type
conf
DOI
10.1109/ISDA.2012.6416624
Filename
6416624