• DocumentCode
    2371967
  • Title
    Machine learning for intrusion detection: Modeling the distribution shift
  • Author
    Farran, Bassam ; Saunders, Craig ; Niranjan, Mahesan
  • Author_Institution
    Sch. of Electron. & Comput. Sci., Univ. of Southampton, Southampton, UK
  • fYear
    2010
  • fDate
    Aug. 29 2010-Sept. 1 2010
  • Firstpage
    232
  • Lastpage
    237
  • Abstract
    This paper addresses two important issues that arise in formulating and solving computer intrusion detection as a machine learning problem, a topic that has attracted considerable attention in recent years, including a community-wide competition using a common data set known as the KDD Cup '99. The first problem is the size of the data set, 5×10^6 by 41 features, which makes conventional learning algorithms impractical. In previous work, we introduced a one-pass non-parametric classification technique called Voted Spheres, which carves up the input space into a series of overlapping hyperspheres. Training data seen within each hypersphere is used in a voting scheme during testing on unseen data. Secondly, we address the problem of distribution shift, whereby the training and test data may be drawn from slightly different probability densities while the conditional densities of class membership for a given datum remain the same. We adopt two recent techniques from the literature, density weighting and kernel mean matching, to enhance the Voted Spheres technique to deal with such distribution disparities. We demonstrate that substantial performance gains can be achieved using these techniques on the KDD Cup data set.
  • Keywords
    learning (artificial intelligence); pattern matching; probability; security of data; density weighting; distribution shift; intrusion detection; kernel mean matching; machine learning; probability densities; voted spheres; Accuracy; Data models; Intrusion detection; Kernel; Logistics; Testing; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning for Signal Processing (MLSP), 2010 IEEE International Workshop on
  • Conference_Location
    Kittila
  • ISSN
    1551-2541
  • Print_ISBN
    978-1-4244-7875-0
  • Electronic_ISBN
    1551-2541
  • Type
    conf
  • DOI
    10.1109/MLSP.2010.5589161
  • Filename
    5589161
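
The abstract describes Voted Spheres only at a high level: a single pass over the training data, overlapping hyperspheres, and a vote at test time. The Python sketch below is one possible reading of that description, not the authors' implementation. The fixed shared radius, the Euclidean distance, the rule for opening a new sphere, the nearest-sphere fallback, and every name (`VotedSpheresSketch`, `partial_fit`, `predict`, `weight`) are assumptions made for illustration. The optional `weight` argument shows where a per-example importance weight, such as one produced by density weighting or kernel mean matching, could enter the vote; how those weights are estimated is not shown.

```python
import math
from collections import defaultdict


class VotedSpheresSketch:
    """Illustrative one-pass hypersphere voting classifier (assumed design)."""

    def __init__(self, radius):
        self.radius = radius   # assumption: one fixed radius shared by all spheres
        self.centres = []      # one centre per sphere
        self.votes = []        # per sphere: label -> accumulated vote weight

    def _covering(self, x):
        # Indices of every sphere whose centre lies within `radius` of x.
        return [i for i, c in enumerate(self.centres)
                if math.dist(c, x) <= self.radius]

    def partial_fit(self, x, label, weight=1.0):
        # `weight` stands in for an importance weight p_test(x) / p_train(x);
        # the default of 1.0 corresponds to assuming no distribution shift.
        hits = self._covering(x)
        if not hits:
            # Assumption: a point covered by no sphere opens a new one.
            self.centres.append(tuple(x))
            self.votes.append(defaultdict(float))
            hits = [len(self.centres) - 1]
        for i in hits:
            self.votes[i][label] += weight

    def predict(self, x):
        if not self.centres:
            raise ValueError("model has not seen any training data")
        hits = self._covering(x)
        if not hits:
            # Assumption: fall back to the single nearest sphere.
            hits = [min(range(len(self.centres)),
                        key=lambda i: math.dist(self.centres[i], x))]
        tally = defaultdict(float)
        for i in hits:
            for label, w in self.votes[i].items():
                tally[label] += w
        return max(tally, key=tally.get)


# Toy usage with made-up two-dimensional points (not KDD Cup data).
model = VotedSpheresSketch(radius=1.0)
for point, label in [((0.0, 0.0), "normal"),
                     ((0.5, 0.0), "normal"),
                     ((5.0, 5.0), "attack")]:
    model.partial_fit(point, label)
print(model.predict((0.2, 0.1)))
```

Each training example is visited once and only per-sphere vote totals are stored, which is what makes a scheme of this kind attractive for a data set of the size mentioned in the abstract. Up-weighting training examples that resemble the test distribution changes the vote totals without changing the sphere layout.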