DocumentCode :
1957706
Title :
Using Normalized Compression Distance for Classifying File Fragments
Author :
Axelsson, Stefan
fYear :
2010
fDate :
15-18 Feb. 2010
Firstpage :
641
Lastpage :
646
Abstract :
We applied Normalized Compression Distance (NCD), a generalized and universal distance measure, to the problem of determining the types of file fragments by example. We developed a corpus of files that can be redistributed to other researchers in the field, and applied NCD with k-nearest-neighbor as the classification algorithm to a random selection of file fragments. The experiment covered approximately 2000 fragments from 17 different file types. Overall accuracy of the n-valued classification rose only from a prior class probability of approximately 6% to about 50%, but the classifier reached accuracies of 85%-100% for the most successful file types.
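The approach summarized in the abstract can be illustrated with a minimal sketch. It uses the standard NCD definition, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The choice of zlib as the compressor, the function names, the value of k, and the majority-vote rule are assumptions made for illustration; the abstract does not state which compressor or parameters the author used.

```python
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    # Assumption: zlib stands in for whichever compressor the paper used.
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    # Standard NCD: values near 0 suggest similar inputs, values near 1 dissimilar ones.
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(fragment: bytes, labelled: list[tuple[str, bytes]], k: int = 3) -> str:
    # k-nearest-neighbor: rank labelled example fragments by NCD to the
    # unknown fragment, then take a majority vote among the k closest.
    nearest = sorted(labelled, key=lambda item: ncd(fragment, item[1]))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]
```

A call such as classify(unknown_fragment, examples, k=3), where examples is a list of (file_type, fragment_bytes) pairs, returns the file type that wins the vote among the three nearest examples.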
Keywords :
computer forensics; data compression; learning (artificial intelligence); pattern classification; file fragment classification; k-nearest neighbor algorithm; n-valued classification; normalized compression distance; Availability; Classification algorithms; Compression algorithms; Concatenated codes; Forensics; Machine learning; Machine learning algorithms; Noise shaping; Security; Shape measurement; APP-IDEN; FOR-DMIN;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2010 International Conference on Availability, Reliability, and Security (ARES '10)
Conference_Location :
Krakow
Print_ISBN :
978-1-4244-5879-0
Type :
conf
DOI :
10.1109/ARES.2010.100
Filename :
5438024