DocumentCode
3165460
Title
Efficient Algorithms for Mining Significant Substructures in Graphs with Quality Guarantees
Author
He, Huahai ; Singh, Ambuj K.
Author_Institution
Univ. of California, Santa Barbara
fYear
2007
fDate
28-31 Oct. 2007
Firstpage
163
Lastpage
172
Abstract
Graphs have become popular for modeling scientific data in recent years. As a result, techniques for mining graphs are extremely important for understanding inherent data and domain characteristics. One such exploratory mining paradigm is the k-MST (minimum spanning tree over k vertices) problem that can be used to discover significant local substructures. In this paper, we present an efficient approximation algorithm for the k-MST problem in large graphs. The algorithm has an O(√k) approximation ratio and O(n log n + m log m log k + nk² log k) running time, where n and m are the number of vertices and edges respectively. Experimental results on synthetic graphs and protein interaction networks show that the algorithm is scalable to large graphs and useful for discovering biological pathways. The highlight of the algorithm is that it offers both analytical guarantees and empirical evidence of good running time and quality.
Keywords
approximation theory; computational complexity; data mining; natural sciences computing; trees (mathematics); approximation algorithm; data mining; large graph theory; minimum spanning tree; scientific data modeling; Algorithm design and analysis; Approximation algorithms; Biological system modeling; Clustering algorithms; Dynamic programming; Frequency; Layout; Proteins; Social network services; Tree graphs;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location
Omaha, NE
ISSN
1550-4786
Print_ISBN
978-0-7695-3018-5
Type
conf
DOI
10.1109/ICDM.2007.11
Filename
4470240