DocumentCode :
2840436
Title :
Clustering Streaming Graphs
Author :
Eldawy, Ahmed ; Khandekar, Rohit ; Wu, Kun-Lung
Author_Institution :
Univ. of Minnesota, Minneapolis, MN, USA
fYear :
2012
fDate :
18-21 June 2012
Firstpage :
466
Lastpage :
475
Abstract :
In this paper, we propose techniques for clustering large-scale "streaming" graphs where the updates to a graph are given in form of a stream of vertex or edge additions and deletions. Our algorithm handles such updates in an online and incremental manner and it can be easily parallel zed. Several previous graph clustering algorithms fall short of handling massive and streaming graphs because they are centralized, they need to know the entire graph beforehand and are not incremental, or they incur an excessive computational overhead. Our algorithm\´s fundamental building block is called graph reservoir sampling. We maintain a reservoir sample of the edges as the graph changes while satisfying certain desired properties like bounding number of clusters or cluster-sizes. We then declare connected components in the sampled sub graph as clusters of the original graph. Our experiments on real graphs show that our approach not only yields clusterings with very good quality, but also obtains orders of magnitude higher throughput, when compared to offline algorithms.
Keywords :
graph theory; pattern clustering; computational overhead; fundamental building block; graph clustering algorithm; graph reservoir sampling; large-scale streaming graphs clustering; Clustering algorithms; Complexity theory; Heuristic algorithms; Image edge detection; Partitioning algorithms; Reservoirs; Throughput; clustering; graph; online; streaming;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Distributed Computing Systems (ICDCS), 2012 IEEE 32nd International Conference on
Conference_Location :
Macau
ISSN :
1063-6927
Print_ISBN :
978-1-4577-0295-2
Type :
conf
DOI :
10.1109/ICDCS.2012.20
Filename :
6258019
Link To Document :
بازگشت