Title :
Do we need a perfect ground-truth for benchmarking Internet traffic classifiers?
Author :
Rosario Oliveira, M. ; Neves, Joao ; Valadas, Rui ; Salvador, Paulo
Author_Institution :
Dept. de Mat., Univ. de Lisboa, Lisbon, Portugal
fDate :
April 26 2015-May 1 2015
Abstract :
The classification of Internet traffic using supervised or semi-supervised statistical learning techniques, both for anomaly detection and identification of Internet applications, has been impaired by difficulties in obtaining a reliable ground-truth, required both to train the classifier and to evaluate its performance. A perfect ground-truth is increasingly difficult, or sometimes impossible, to obtain due to the growing percentage of cyphered traffic, the sophistication of network attacks, and the constant updates of Internet applications. In this paper, we study the impact of the ground-truth on training the classifier and estimating its performance measures. We show both theoretically and through simulation that ground-truth imperfections can severely bias the performance estimates. We then propose a latent class model that overcomes this problem by combining estimates of several classifiers over the same dataset. The model is evaluated using a high-quality dataset that includes the most representative Internet applications and network attacks. The results show that our latent class model produces very good performance estimates under mild levels of ground-truth imperfection, and can thus be used to correctly benchmark Internet traffic classifiers when only an imperfect ground-truth is available.
Keywords :
Internet; learning (artificial intelligence); statistical analysis; telecommunication traffic; Internet traffic classification; ground-truth imperfection; latent class model; semisupervised statistical learning technique; Computers; Conferences; Estimation; IP networks; Internet; Standards; Training; Anomaly Detection; Identification of Internet Applications; Latent Class Models; Traffic Classification
Conference_Titel :
Computer Communications (INFOCOM), 2015 IEEE Conference on
Conference_Location :
Kowloon
DOI :
10.1109/INFOCOM.2015.7218634
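
Illustrative note (not part of the original record): the abstract states that ground-truth imperfections can bias performance estimates. The sketch below is a minimal illustration of that effect only. It is not the paper's latent class model, and it rests on simplifying assumptions that do not come from the source: binary labels, a classifier whose errors are symmetric and independent of the reference-label errors, and reference labels flipped independently with a fixed probability. The names `apparent_accuracy`, `simulate`, `true_acc`, and `label_noise` are hypothetical.

```python
import random


def apparent_accuracy(true_acc, label_noise):
    """Expected agreement between a classifier and a noisy reference labelling.

    Assumes binary labels and independent, symmetric errors (an illustrative
    assumption, not the paper's model). The classifier agrees with the noisy
    reference when both are right or both are wrong.
    """
    return true_acc * (1 - label_noise) + (1 - true_acc) * label_noise


def simulate(true_acc, label_noise, n=200_000, seed=0):
    """Monte Carlo estimate of the same quantity, for a sanity check."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n):
        truth = rng.random() < 0.5                               # hidden true class
        pred = truth if rng.random() < true_acc else not truth   # classifier output
        ref = truth if rng.random() >= label_noise else not truth  # imperfect ground-truth
        agree += (pred == ref)
    return agree / n


if __name__ == "__main__":
    # A classifier that is right 95% of the time, scored against a
    # reference labelling that is wrong 10% of the time.
    print(apparent_accuracy(0.95, 0.10))
    print(simulate(0.95, 0.10))
```

Under these assumptions, a classifier with a true accuracy of 0.95 is expected to agree with a reference that is wrong 10% of the time on only about 86% of samples, so scoring against the imperfect reference understates its performance. Correlated errors or multi-class traffic would change the size and possibly the direction of the bias.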