DocumentCode
2457934
Title
HiCS: High Contrast Subspaces for Density-Based Outlier Ranking
Author
Keller, Fabian ; Müller, Emmanuel ; Böhm, Klemens
Author_Institution
Institute for Program Structures and Data Organization, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
fYear
2012
fDate
1-5 April 2012
Firstpage
1037
Lastpage
1048
Abstract
Outlier mining is a major task in data analysis. Outliers are objects that deviate strongly from regular objects in their local neighborhood. Density-based outlier ranking methods score each object based on its degree of deviation. In many applications, these ranking methods degenerate to random listings due to low contrast between outliers and regular objects. Outliers do not show up in the scattered full space; instead, they are hidden in multiple high-contrast subspace projections of the data. Measuring the contrast of such subspaces for outlier rankings is an open research challenge. In this work, we propose a novel subspace search method that selects high-contrast subspaces for density-based outlier ranking. It is designed as a pre-processing step to outlier ranking algorithms. It searches for high-contrast subspaces with a significant amount of conditional dependence among the subspace dimensions. With our approach, we propose a first measure for the contrast of subspaces. Thus, we enhance the quality of traditional outlier rankings by computing outlier scores in high-contrast projections only. The evaluation on real and synthetic data shows that our approach outperforms traditional dimensionality reduction techniques, naive random projections, and state-of-the-art subspace search techniques, and provides enhanced quality for outlier ranking.
Keywords
data analysis; data mining; conditional dependence; density-based outlier ranking; high contrast subspace projection; outlier mining; outlier ranking algorithm; quality enhancement; subspace dimension; subspace search method; Atmospheric measurements; Correlation; Density measurement; Joints; Noise level; Probability density function
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering (ICDE), 2012 IEEE 28th International Conference on
Conference_Location
Washington, DC
ISSN
1063-6382
Print_ISBN
978-1-4673-0042-1
Type
conf
DOI
10.1109/ICDE.2012.88
Filename
6228154