DocumentCode :
3765359
Title :
Are We Missing Labels? A Study of the Availability of Ground-Truth in Network Security Research
Author :
Sebastian Abt;Harald Baier
Author_Institution :
da/sec - Biometrics &
fYear :
2014
Firstpage :
40
Lastpage :
55
Abstract :
Network security is a long-standing field of research that constantly encounters new challenges. Inherently, research in this field is highly data-driven. Specifically, many approaches employ supervised machine learning, which requires labelled input data. While different publicly available data sets exist, labelling information is sparse. In order to understand how our community deals with this lack of labels, we perform a systematic study of network security research accepted at top IT security conferences in 2009-2013. Our analysis reveals that 70% of the papers reviewed rely on manually compiled data sets. Furthermore, only 10% of the studied papers release the data sets after compilation. This shows that our community is facing a missing labelled data problem. In order to address this problem, we give a definition and discuss its crucial characteristics. Furthermore, we reflect on and discuss roads towards overcoming this problem by establishing ground-truth and fostering data sharing.
Keywords :
"Security","Communication networks","Internet","IP networks","Payloads","Labeling","Biometrics (access control)"
Publisher :
IEEE
Conference_Titel :
Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS), 2014 Third International Workshop on
Print_ISBN :
978-1-4799-8308-7
Type :
conf
DOI :
10.1109/BADGERS.2014.11
Filename :
7446034