Title :
Automatically Spotting Information-Rich Nodes in Graphs
Author :
He, Xiao ; Feng, Jing ; Plant, Claudia
Author_Institution :
Univ. of Munich, Munich, Germany
Abstract :
How to automatically spot the most interesting nodes in a large graph? Although graph mining has attracted lots of attention this question remains challenging for the following reasons: (1) It is not trivial to define what an interesting node is. In some cases, application-specific background knowledge may help but is not always available. Moreover, the notion of interestingness depends on the structure of the graph. In a sparse graph, a hub or a node at the center of a star-like or clique-like sub graph might be interesting. However, if the graph is densely connected, a remote node might be more interesting. (2) Most existing approaches to detect abnormal nodes involve parameters which are difficult to estimate. Our approach addresses both challenges by considering the problem from an information-theoretic perspective. An interesting node exhibits an outstanding link pattern which contributes much nonredundant information to the overall information content of the graph. Based on this basic idea, we define an intuitive similarity measure between nodes. Combined with metric embedding, this similarity measure allows spotting the most interesting nodes at first glance. Moreover, we propose Info-spot, a parameter free and efficient algorithm for fully automatic detection of interesting nodes. Guided by the objective of data compression, Info-spot greedily merges nodes exhibiting similar link patterns, while exceptional information-rich nodes naturally emerge. Extensive experiments demonstrate the benefits of our approach.
Keywords :
data mining; graph theory; abnormal node detection; automatic information-rich node spotting; automatic interesting node detection; clique-like subgraph; graph mining; info-spot algorithm; information-theoretic perspective; sparse graph; star-like subgraph; Complexity theory; Data mining; Encoding; Entropy; Merging; Runtime; Vectors; Automatically Detection; Information-theoretic; Outstanding Nodes; Rich Link Pattern;
Conference_Titel :
Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4673-0005-6
DOI :
10.1109/ICDMW.2011.37