• DocumentCode
    3165460
  • Title

    Efficient Algorithms for Mining Significant Substructures in Graphs with Quality Guarantees

  • Author

    He, Huahai ; Singh, Ambuj K.

  • Author_Institution
    Univ. of California, Santa Barbara
  • fYear
    2007
  • fDate
    28-31 Oct. 2007
  • Firstpage
    163
  • Lastpage
    172
  • Abstract
    Graphs have become popular for modeling scientific data in recent years. As a result, techniques for mining graphs are extremely important for understanding inherent data and domain characteristics. One such exploratory mining paradigm is the k-MST (minimum spanning tree over k vertices) problem that can be used to discover significant local substructures. In this paper, we present an efficient approximation algorithm for the k-MST problem in large graphs. The algorithm has an O(√k) approximation ratio and O(n log n + m log m log k + nk² log k) running time, where n and m are the number of vertices and edges respectively. Experimental results on synthetic graphs and protein interaction networks show that the algorithm is scalable to large graphs and useful for discovering biological pathways. The highlight of the algorithm is that it offers both analytical guarantees and empirical evidence of good running time and quality.
  • Keywords
    approximation theory; computational complexity; data mining; natural sciences computing; trees (mathematics); approximation algorithm; data mining; large graph theory; minimum spanning tree; scientific data modeling; Algorithm design and analysis; Approximation algorithms; Biological system modeling; Clustering algorithms; Dynamic programming; Frequency; Layout; Proteins; Social network services; Tree graphs;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
  • Conference_Location
    Omaha, NE
  • ISSN
    1550-4786
  • Print_ISBN
    978-0-7695-3018-5
  • Type

    conf

  • DOI
    10.1109/ICDM.2007.11
  • Filename
    4470240