DocumentCode :
1411199
Title :
Efficient Algorithms for Summarizing Graph Patterns
Author :
Li, Jianzhong ; Liu, Yong ; Gao, Hong
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
Volume :
23
Issue :
9
fYear :
2011
Firstpage :
1388
Lastpage :
1405
Abstract :
We investigate the problem of summarizing frequent subgraphs by a smaller set of representative patterns. We show that some special graph patterns, called δ-jump patterns in this paper, must be representative patterns. Based on the fact, we devise two algorithms, RP-FP and RP-GD, to mine a representative set that summarizes frequent subgraphs. RP-FP derives a representative set from frequent closed subgraphs, whereas RP-GD mines a representative set from graph databases directly. Three novel heuristic strategies, Last-Succeed-First-Check, Reverse-Path-Trace, and Nephew-Representative-Based-Cover, are proposed to further improve the efficiency of RP-GD. RP-FP can provide a tight ratio bound but has heavy computation cost. RP-GD cannot provide a ratio bound guarantee but is more efficient than RP-FP. We also make use of the similarity between sibling branches in the graph pattern space to devise another much more efficient algorithm, RP-Leap, for mining a representative set that can approximately summarize frequent subgraphs. Our extensive experiments on both real and synthetic data sets verify the summarization quality and efficiency of our algorithms. To further demonstrate the interestingness of representative patterns, we study an application of representative patterns to classification. We demonstrate that the classification accuracy achieved by representative pattern-based model is no less than that achieved by closed graph pattern-based model.
Keywords :
graph theory; pattern classification; graph database; graph pattern; pattern classification; synthetic data set; Algorithm design and analysis; Approximation algorithms; Context; Data mining; Itemsets; Testing; Data mining; graph mining; pattern summarization.;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
ieee
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2010.249
Filename :
5674039
Link To Document :
بازگشت