DocumentCode
2859636
Title
Clustering High-Dimensional Data with Low-Order Neighbors
Author
Zhao, Yanchang ; Zhang, Chengqi ; Shen, Yi-Dong
Author_Institution
Univ. of Technology, Sydney, Australia
fYear
2004
fDate
20-24 Sept. 2004
Firstpage
103
Lastpage
109
Abstract
Density-based and grid-based clustering are two main clustering approaches. The former is famous for its capability of discovering clusters of various shapes and eliminating noises, while the latter is well known for its high speed. Combination of the two approaches seems to provide better clustering results. To the best of our knowledge, however, all existing algorithms that combine density-based clustering and grid-based clustering take cells as atomic units, in the sense that either all objects in a cell belong to a cluster or no object in the cell belong to any cluster. This requires the cells to be small enough to ensure the fine resolution of results. In high-dimensional spaces, however, the number of cells can be very large when cells are small, which would make the clustering process extremely costly. On the other hand, the number of neighbors of a cell grows exponentially with the dimensionality of datasets, which makes the complexity increase further. In this paper, we present a new approach that takes objects (or points) as the atomic units, so that the restriction of cell size can be relaxed without degrading the resolution of clustering results. In addition, a concept of ith-order neighbors is introduced to avoid considering the exponential number of neighboring cells. By considering only low-order neighbors, our algorithm is very efficient while losing only a little bit of accuracy. Experiments on synthetic and public data show that our algorithm can cluster high-dimensional data effectively and efficiently.
Keywords
Australia; Cloning; Clustering algorithms; Computer science; Degradation; Grid computing; Information technology; Noise shaping; Partitioning algorithms; Shape;
fLanguage
English
Publisher
ieee
Conference_Titel
Web Intelligence, 2004. WI 2004. Proceedings. IEEE/WIC/ACM International Conference on
Print_ISBN
0-7695-2100-2
Type
conf
DOI
10.1109/WI.2004.10128
Filename
1410790
Link To Document