• DocumentCode
    2209390
  • Title

    Subspace Clustering Meets Dense Subgraph Mining: A Synthesis of Two Paradigms

  • Author

    Günnemann, Stephan ; Färber, Ines ; Boden, Brigitte ; Seidl, Thomas

  • Author_Institution
    RWTH Aachen Univ., Aachen, Germany
  • fYear
    2010
  • fDate
    13-17 Dec. 2010
  • Firstpage
    845
  • Lastpage
    850
  • Abstract
    Today's applications deal with multiple types of information: graph data to represent the relations between objects and attribute data to characterize single objects. Analyzing both data sources simultaneously can increase the quality of mining methods. Recently, combined clustering approaches were introduced, which detect densely connected node sets within one large graph that also show high similarity according to all of their attribute values. However, for attribute data it is known that this full-space clustering often leads to poor clustering results. Thus, subspace clustering was introduced to identify locally relevant subsets of attributes for each cluster. In this work, we propose a method for finding homogeneous groups by joining the paradigms of subspace clustering and dense subgraph mining, i.e., we determine sets of nodes that show high similarity in subsets of their dimensions and that are also densely connected within the given graph. Our twofold clusters are optimized according to their density, size, and number of relevant dimensions. Our redundancy model confines the clustering to a manageable size of only the most interesting clusters. We introduce the algorithm Gamer for the efficient calculation of our clustering. In thorough experiments on synthetic and real-world data, we show that Gamer achieves low runtimes and high clustering quality.
  • Keywords
    data mining; graph theory; optimisation; pattern clustering; redundancy; data attribute; dense subgraph mining; graph data; subspace clustering; attribute data; combined clustering approach; redundancy removal
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2010 IEEE 10th International Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-9131-5
  • Electronic_ISBN
    1550-4786
  • Type

    conf

  • DOI
    10.1109/ICDM.2010.95
  • Filename
    5694049