Title :
Data Base Investigation as a Ranking Problem
Author_Institution :
Digital Technol. & Biometrics Dept., Netherlands Forensic Inst., The Hague, Netherlands
Abstract :
In data mining for forensic investigations, the classes are typically strongly imbalanced. Moreover, the labels of the non-target (negative) class are usually not confirmed; in other words, the non-positive objects are unlabeled. Classification methods are not well suited to these situations, so we propose to treat such problems as ranking problems. We apply several supervised learning methods, including recently developed methods aimed specifically at optimizing ranking performance. Using a dataset from a real investigation, we show that the ranking approach improves on the prior probabilities. Some two-class classification methods turn out to be competitive in ranking performance, while the dedicated ranking methods do not stand out.
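As a rough illustration of what "improvement over the prior probabilities" can mean for a ranking, the sketch below compares the fraction of confirmed positives among the top-ranked objects with the overall base rate. The abstract does not name the evaluation measure used in the paper, so precision-at-k is an assumption here, and the labels, scores, and function names are hypothetical toy values chosen only for the example.

```python
def precision_at_k(scores, labels, k):
    """Fraction of confirmed positives among the k highest-scored objects."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / k


def prior(labels):
    """Fraction of confirmed positives in the whole set (the base rate)."""
    return sum(labels) / len(labels)


# Hypothetical toy data: 1 = confirmed positive, 0 = unlabeled
# (unlabeled does not mean confirmed negative).
labels = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]

# Hypothetical scores from any scoring model; higher means "more likely a target".
scores = [0.10, 0.30, 0.90, 0.20, 0.05, 0.40, 0.15, 0.70, 0.80, 0.25]

base_rate = prior(labels)                   # what picking objects at random would yield
p_at_3 = precision_at_k(scores, labels, 3)  # what inspecting the top 3 of the ranking yields
lift = p_at_3 / base_rate                   # values above 1 mean the ranking beats the prior

print(f"prior={base_rate:.2f}  precision@3={p_at_3:.2f}  lift={lift:.2f}")
```

Because only the order of the scores matters, any two-class classifier that outputs a continuous score can be evaluated this way alongside methods built specifically for ranking.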
Keywords :
computer forensics; data mining; database management systems; learning (artificial intelligence); optimisation; pattern classification; probability; database investigation; forensic investigations; negative class labels; nontarget class labels; prior probabilities; ranking performance optimization; ranking problem; supervised learning methods; two-class classification methods; unlabeled nonpositive objects; Accuracy; Equations; Mathematical model; Noise; Optimization; Sociology; Support vector machines
Conference_Title :
Intelligence and Security Informatics Conference (EISIC), 2012 European
Conference_Location :
Odense
Print_ISBN :
978-1-4673-2358-1
DOI :
10.1109/EISIC.2012.44