• DocumentCode
    2731751
  • Title
    Distance Based Subspace Clustering with Flexible Dimension Partitioning
  • Author
    Liu, Guimei ; Li, Jinyan ; Sim, Kelvin ; Wong, Limsoon
  • Author_Institution
    Nat. Univ. of Singapore
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Firstpage
    1250
  • Lastpage
    1254
  • Abstract
    Traditional similarity or distance measurements usually become meaningless as the dimensionality of a dataset increases, which has detrimental effects on clustering performance. In this paper, we propose a distance-based subspace clustering model, called nCluster, to find groups of objects that have similar values on subsets of dimensions. Instead of using a grid-based approach to partition the data space into non-overlapping rectangular cells, as in density-based subspace clustering algorithms, the nCluster model uses a more flexible method to partition the dimensions to preserve meaningful and significant clusters. We develop an efficient algorithm to mine only maximal nClusters. A set of experiments is conducted to show the efficiency of the proposed algorithm and the effectiveness of the new model in preserving significant clusters.
  • Keywords
    data mining; database theory; pattern clustering; distance based subspace clustering; flexible dimension partitioning; nCluster; Clustering algorithms; Distance measurement; Kelvin; Merging; Partitioning algorithms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
  • Conference_Location
    Istanbul
  • Print_ISBN
    1-4244-0802-4
  • Electronic_ISBN
    1-4244-0803-2
  • Type
    conf
  • DOI
    10.1109/ICDE.2007.368985
  • Filename
    4221775