• DocumentCode
    2859636
  • Title

    Clustering High-Dimensional Data with Low-Order Neighbors

  • Author

    Zhao, Yanchang ; Zhang, Chengqi ; Shen, Yi-Dong

  • Author_Institution
    Univ. of Technology, Sydney, Australia
  • fYear
    2004
  • fDate
    20-24 Sept. 2004
  • Firstpage
    103
  • Lastpage
    109
  • Abstract
    Density-based and grid-based clustering are two main clustering approaches. The former is famous for its capability of discovering clusters of various shapes and eliminating noises, while the latter is well known for its high speed. Combination of the two approaches seems to provide better clustering results. To the best of our knowledge, however, all existing algorithms that combine density-based clustering and grid-based clustering take cells as atomic units, in the sense that either all objects in a cell belong to a cluster or no object in the cell belong to any cluster. This requires the cells to be small enough to ensure the fine resolution of results. In high-dimensional spaces, however, the number of cells can be very large when cells are small, which would make the clustering process extremely costly. On the other hand, the number of neighbors of a cell grows exponentially with the dimensionality of datasets, which makes the complexity increase further. In this paper, we present a new approach that takes objects (or points) as the atomic units, so that the restriction of cell size can be relaxed without degrading the resolution of clustering results. In addition, a concept of ith-order neighbors is introduced to avoid considering the exponential number of neighboring cells. By considering only low-order neighbors, our algorithm is very efficient while losing only a little bit of accuracy. Experiments on synthetic and public data show that our algorithm can cluster high-dimensional data effectively and efficiently.
  • Keywords
    Australia; Cloning; Clustering algorithms; Computer science; Degradation; Grid computing; Information technology; Noise shaping; Partitioning algorithms; Shape;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Web Intelligence, 2004. WI 2004. Proceedings. IEEE/WIC/ACM International Conference on
  • Print_ISBN
    0-7695-2100-2
  • Type

    conf

  • DOI
    10.1109/WI.2004.10128
  • Filename
    1410790