DocumentCode
240910
Title
Classification of Partially Labeled Malicious Web Traffic in the Presence of Concept Drift
Author
Anastasovski, Goce ; Goseva-Popstojanova, Katerina
Author_Institution
Alarm.com, Vienna, VA, USA
fYear
2014
fDate
June 30 2014-July 2 2014
Firstpage
130
Lastpage
139
Abstract
Attacks on Web systems have shown an increasing trend in the recent past. A contributing factor to this trend is the deployment of Web 2.0 technologies. While work on the characterization and classification of malicious Web traffic using supervised learning exists, little work has been done using semi-supervised learning with partially labeled data. In this paper, an incremental semi-supervised algorithm (CSL-Stream) is used to classify malicious Web traffic into multiple classes, as well as to analyze the concept drift and concept evolution phenomena. The work is based on data collected over a period of nine months by a high-interaction honeypot running Web 2.0 applications. The results showed that on completely labeled data, semi-supervised learning performed only slightly worse than the supervised learning algorithm. More importantly, multiclass classification of the partially labeled malicious Web traffic (i.e., 50% or 25% labeled sessions) was almost as good as the classification of completely labeled data.
Keywords
Internet; learning (artificial intelligence); pattern classification; security of data; CSL-Stream; Web 2.0 technologies; Web systems; concept drift; concept evolution; high-interaction honeypot; incremental semisupervised algorithm; malicious Web traffic; partially labeled data; partially labeled malicious Web traffic classification; semisupervised learning; supervised learning; Accuracy; Blogs; Electronic publishing; Information services; Internet; Measurement; Semisupervised learning; Concept drift; Concept evolution; Malicious Web traffic classification; Multiclass classification; Semi-supervised learning; Web 2.0 security;
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Security and Reliability-Companion (SERE-C), 2014 IEEE Eighth International Conference on
Conference_Location
San Francisco, CA
Type
conf
DOI
10.1109/SERE-C.2014.31
Filename
6901650