• DocumentCode
    2864749
  • Title

    A generic framework for efficient subspace clustering of high-dimensional data

  • Author

    Kriegel, Hans-Peter ; Kröger, Peer ; Renz, Matthias ; Wurst, Sebastian

  • Author_Institution
    Inst. for Comput. Sci., Munich Univ., Germany
  • fYear
    2005
  • fDate
    27-30 Nov. 2005
  • Abstract
    Subspace clustering has been investigated extensively because traditional clustering algorithms often fail to detect meaningful clusters in high-dimensional data spaces. Many recently proposed subspace clustering methods suffer from two severe problems. First, the algorithms typically scale exponentially with the data dimensionality and/or the subspace dimensionality of the clusters. Second, for performance reasons, many algorithms use a global density threshold for clustering, which is questionable because clusters in subspaces of significantly different dimensionality will most likely exhibit significantly varying densities. In this paper, we propose a generic framework to overcome these limitations. Our framework is based on an efficient filter-refinement architecture that scales at most quadratically w.r.t. the data dimensionality and the dimensionality of the subspace clusters. It can be applied to any clustering notion, including notions that are based on a local density threshold. A broad experimental evaluation on synthetic and real-world data empirically shows that our method achieves a significant gain in runtime and quality in comparison to state-of-the-art subspace clustering algorithms.
  • Keywords
    data mining; pattern clustering; data dimensionality; filter-refinement architecture; high-dimensional data; subspace clustering; subspace dimensionality; Clustering algorithms; Clustering methods; Computer science; Data mining; Diseases; Gene expression; Partitioning algorithms; Principal component analysis; Runtime; Scalability
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, Fifth IEEE International Conference on
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2278-5
  • Type

    conf

  • DOI
    10.1109/ICDM.2005.5
  • Filename
    1565686