Title :
Robust feature selection and robust PCA for internet traffic anomaly detection
Author :
Pascoal, Cláudia ; de Oliveira, M. Rosário ; Valadas, Rui ; Filzmoser, Peter ; Salvador, Paulo ; Pacheco, António
Author_Institution :
CEMAT, UTL, Lisbon, Portugal
Abstract :
Robust statistics is a branch of statistics comprising methods that cope adequately with the presence of outliers. In this paper, we propose an anomaly detection method that combines a feature selection algorithm with an outlier detection method and makes extensive use of robust statistics. Feature selection is based on a mutual information metric for which we have developed a robust estimator; it also includes a novel, automatic procedure for determining the number of relevant features. Outlier detection is based on robust Principal Component Analysis (PCA) which, unlike classical PCA, is not sensitive to outliers and removes the need for training on a reliably labeled dataset, a strong advantage from an operational point of view. To evaluate our method, we designed a network scenario capable of producing perfect ground truth under real (but controlled) traffic conditions. Results show significant improvements of our method over the corresponding classical ones. Moreover, although it is a largely overlooked issue in the context of anomaly detection, feature selection is found to be an important preprocessing step, allowing adaptation to different network conditions and yielding significant performance gains.
Keywords :
Internet; principal component analysis; telecommunication traffic; Internet traffic anomaly detection; ground-truth; network scenario; outlier detection method; robust PCA; robust feature selection; robust statistics; Estimation; Feature extraction; Gaussian distribution; Measurement; Robustness; Vectors
Conference_Title :
INFOCOM, 2012 Proceedings IEEE
Conference_Location :
Orlando, FL
Print_ISBN :
978-1-4673-0773-4
DOI :
10.1109/INFCOM.2012.6195548