DocumentCode
612177
Title
Ranking outlier nodes in subspaces of attributed graphs
Author
Muller, E. ; Sanchez, P.I. ; Mulle, Y. ; Bohm, K.
Author_Institution
Karlsruhe Inst. of Technol. (KIT), Karlsruhe, Germany
fYear
2013
fDate
8-12 April 2013
Firstpage
216
Lastpage
222
Abstract
Outlier analysis is an important data mining task that aims to detect unexpected, rare, and suspicious objects. Outlier ranking enables enhanced outlier exploration, which assists user-driven outlier analysis. It overcomes the binary detection of outliers vs. regular objects, which is not adequate for many applications. Traditional outlier ranking techniques focus either on vector data or on graph structures. However, many of today's databases store both multidimensional numeric information and relations between objects in attributed graphs. An open challenge is how outlier ranking should cope with these different data types in a unified fashion. In this work, we propose a first approach for outlier ranking in subspaces of attributed graphs. We rank graph nodes according to their degree of deviation in both graph and attribute properties. We describe novel challenges induced by this combination of data types and propose subspace analysis as an important method for outlier ranking on attributed graphs. Subspace clustering provides a selected subset of nodes and its relevant attributes in which deviation of nodes can be observed. Our graph outlier ranking (GOutRank) introduces scoring functions based on these selected subgraphs and subspaces. In addition to this technical contribution, we provide an attributed graph extracted from the Amazon marketplace. It includes a ground truth of real outliers labeled in a user experiment. In order to enable sustainable and comparable research results, we publish this database on our website as a benchmark for future publications. Our experiments on this graph demonstrate the potential and the capabilities of outlier ranking in subspaces of attributed graphs.
Keywords
data mining; graph theory; pattern clustering; GOutRank; attributed graph subspaces; binary detection; data mining task; graph structures; outlier nodes ranking; subspace clustering; user-driven outlier analysis; Benchmark testing; Context; Data mining; Databases; Educational institutions; Eigenvalues and eigenfunctions; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on
Conference_Location
Brisbane, QLD
Print_ISBN
978-1-4673-5303-8
Electronic_ISBN
978-1-4673-5302-1
Type
conf
DOI
10.1109/ICDEW.2013.6547453
Filename
6547453