DocumentCode
172454
Title
Hashdoop: A MapReduce framework for network anomaly detection
Author
Fontugne, Romain ; Mazel, Johan ; Fukuda, Kensuke
Author_Institution
Nat. Inst. of Inf., Tokyo, Japan
fYear
2014
fDate
April 27 2014-May 2 2014
Firstpage
494
Lastpage
499
Abstract
Anomaly detection is essential for preventing network outages and keeping network resources available. However, to cope with the growth of Internet traffic, network anomaly detectors are exposed only to sampled traffic, so harmful traffic may escape examination. In this paper, we investigate the benefits of recent distributed computing approaches for real-time analysis of non-sampled Internet traffic. Focusing on the MapReduce model, our study uncovers a fundamental difficulty in detecting network traffic anomalies with Hadoop. Because MapReduce requires the dataset to be divided into small splits, and anomaly detectors compute statistics from spatial and temporal traffic structures, special care must be taken when splitting traffic. We propose Hashdoop, a MapReduce framework that splits traffic with a hash function to preserve traffic structures and hence benefits from distributed computing infrastructures to detect network anomalies. The benefits of Hashdoop are evaluated with two anomaly detectors and fifteen traces of Internet backbone traffic captured between 2001 and 2013. Using a 6-node cluster, Hashdoop increased the throughput of the slowest detector with a speed-up of 15, enabling real-time detection for the largest analyzed traces. Hashdoop also improved the overall accuracy of the detectors, as splits emphasized anomalies by reducing the surrounding traffic.
Keywords
Internet; computer network security; cryptography; statistics; telecommunication traffic; Hashdoop; Internet backbone traffic; MapReduce framework; distributed computing approach; hash function; network anomaly detection; network anomaly detectors; network outage prevention; network resources; network traffic anomaly; nonsampled Internet traffic real-time analysis; spatial traffic structures; temporal traffic structures; Computational modeling; Conferences; Detectors; IP networks; Ports (Computers); Security; Throughput
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Communications Workshops (INFOCOM WKSHPS), 2014 IEEE Conference on
Conference_Location
Toronto, ON
Type
conf
DOI
10.1109/INFCOMW.2014.6849281
Filename
6849281