Title :
Data Fusion for Outlier Detection through Pseudo-ROC Curves and Rank Distributions
Author :
Evangelista, Paul F. ; Embrechts, Mark J. ; Szymanski, Boleslaw K.
Author_Institution :
United States Military Acad., West Point
Abstract :
This paper proposes a novel method of fusing models for the classification of unbalanced data. The unbalanced data contain a majority of healthy (negative) instances and a minority of unhealthy (positive) instances. The applicability of this type of classification problem to security applications inspired the name security classification problems (SCP). The area under the ROC curve (AUC) is the metric used to measure classifier performance; to better understand AUC and ROC behavior, pseudo-ROC curves created from simulated data are introduced. ROC curves depend entirely upon the rankings created by classifiers. The rank distributions discussed in this paper display classifier performance in a novel form, and their behavior provides insight into classifier fusion for the SCP. Rank distributions, which illustrate the probability that a particular rank contains a positive or negative instance, are introduced and used to explain why synergistic classifier fusion occurs.
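The abstract's statement that ROC curves depend entirely upon rankings can be illustrated with a minimal sketch. It is not the authors' method: it uses the standard Mann-Whitney rank-sum identity for AUC, and it estimates a rank distribution from simulated scores. The Gaussian score model, the function names `auc_from_scores` and `rank_distribution`, and the parameters `shift`, `trials`, and `seed` are all assumptions made for illustration only.

```python
import random


def auc_from_scores(scores, labels):
    """AUC via the Mann-Whitney rank-sum identity (ties get average ranks).

    Only the ordering of the scores matters, not their magnitudes.
    labels: 1 = positive (unhealthy), 0 = negative (healthy).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def rank_distribution(n_pos, n_neg, shift, trials, seed=0):
    """Estimate P(rank r holds a positive) by simulation.

    Assumption (not from the paper): negatives score N(0, 1) and
    positives score N(shift, 1). Rank 1 is the highest score.
    """
    rng = random.Random(seed)
    counts = [0] * (n_pos + n_neg)
    for _ in range(trials):
        scored = [(rng.gauss(shift, 1.0), 1) for _ in range(n_pos)]
        scored += [(rng.gauss(0.0, 1.0), 0) for _ in range(n_neg)]
        scored.sort(key=lambda t: t[0], reverse=True)
        for r, (_, y) in enumerate(scored):
            counts[r] += y
    return [c / trials for c in counts]
```

Because `auc_from_scores` uses only ranks, any strictly increasing transformation of the scores leaves the AUC unchanged. The list returned by `rank_distribution` sums to `n_pos`, since each simulated trial places exactly `n_pos` positives somewhere in the ranking.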
Keywords :
pattern classification; probability; sensor fusion; classifier fusion; data fusion; outlier detection; pseudo-ROC curves; rank distributions; security classification problems; Area measurement; Computer science; Data engineering; Data security; Displays; Electronic mail; Military computing; Modeling; Robustness; Systems engineering and theory
Conference_Title :
Neural Networks, 2006. IJCNN '06. International Joint Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
0-7803-9490-9
DOI :
10.1109/IJCNN.2006.246989