• DocumentCode
    3016882
  • Title

    Generic feature selection measure for botnet malware detection

  • Author

    Berg, P.E. ; Franke, Katrin ; Hai Thanh Nguyen

  • Author_Institution
    Dept. of Comput. Sci. & Media Technol., Gjovik Univ. Coll., Gjövik, Norway
  • fYear
    2012
  • fDate
    27-29 Nov. 2012
  • Firstpage
    711
  • Lastpage
    717
  • Abstract
    Feature selection is an important task in botnet malware detection. In this paper, we study the recently proposed Generic-Feature-Selection (GeFS) measure [18]. Since there is no benchmark dataset of botnet malware, we conduct experiments on a dataset generated with publicly available tools. We use static and dynamic approaches [24], [29], [12] to extract features from the generated dataset, producing two separate feature sets. We analyze the statistical properties of these feature sets to gain more insight into their nature and quality, and subsequently determine appropriate instances of the GeFS measure for feature selection. We compare the GeFS measure experimentally with two other feature selection methods for botnet malware detection: the genetic-algorithm-CFS and the best-first-CFS algorithms. We use five different classifiers to test the detection rates and false positive rates. The experiments show that we can remove 99.9% of irrelevant and redundant features from the datasets while maintaining or even improving classification performance. Moreover, the GeFS measure outperforms the genetic-algorithm-CFS and best-first-CFS methods by removing many more redundant features.
  • Keywords
    feature extraction; invasive software; linear programming; statistical analysis; tree searching; GeFS measure; best-first-CFS algorithm; botnet malware detection; branch-and-bound; dynamic approaches; feature sets; generic feature selection measure; genetic-algorithm-CFS algorithm; mixed 0-1 linear programming; static approaches; statistical properties; Correlation; Data mining; Intrusion detection; Libraries; Malware; Mutual information; botnets; feature selection; machine learning; malware analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems Design and Applications (ISDA), 2012 12th International Conference on
  • Conference_Location
    Kochi
  • ISSN
    2164-7143
  • Print_ISBN
    978-1-4673-5117-1
  • Type

    conf

  • DOI
    10.1109/ISDA.2012.6416624
  • Filename
    6416624