DocumentCode :
1843030
Title :
Active learning with neural networks for intrusion detection
Author :
Seliya, Naeem ; Khoshgoftaar, Taghi M.
Author_Institution :
Comput. & Inf. Sci., Univ. of Michigan-Dearborn, Dearborn, MI, USA
fYear :
2010
fDate :
4-6 Aug. 2010
Firstpage :
49
Lastpage :
54
Abstract :
This paper presents a neural-network-based active learning procedure for computer network intrusion detection. Applying data mining and machine learning techniques to network intrusion detection often faces the problem of very large training datasets. For example, the training dataset commonly used for the DARPA KDD-1999 offline intrusion detection project contained approximately five hundred thousand observations (a 10% sample of the original five million), which were used to build intrusion detection classification models. The practical problems associated with such a large dataset include very long model training times, redundant information, and increased complexity in understanding the domain-specific data. We demonstrate that a simple active learning procedure can dramatically reduce the size of the training data without significantly sacrificing the classification accuracy of the intrusion detection model. A case study of the DARPA KDD-1999 intrusion detection project is used in our work. The network traffic instances are classified into one of two categories: normal and attack. A comparison of the actively trained neural network model with a C4.5 decision tree indicated that the actively learned model had better generalization accuracy. In addition, the training data classification performance of the actively learned model was comparable to that of the C4.5 decision tree.
Keywords :
data mining; decision trees; learning (artificial intelligence); neural nets; security of data; C4.5 decision tree; DARPA KDD-1999; active learning; computer network intrusion detection; data mining; machine learning techniques; network traffic instances; neural networks; very large training dataset size; Artificial neural networks; Biological system modeling; Data models; Intrusion detection; Machine learning; Training; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Reuse and Integration (IRI), 2010 IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-8097-5
Type :
conf
DOI :
10.1109/IRI.2010.5558967
Filename :
5558967