DocumentCode
2731751
Title
Distance Based Subspace Clustering with Flexible Dimension Partitioning
Author
Liu, Guimei ; Li, Jinyan ; Sim, Kelvin ; Wong, Limsoon
Author_Institution
Nat. Univ. of Singapore
fYear
2007
fDate
15-20 April 2007
Firstpage
1250
Lastpage
1254
Abstract
Traditional similarity or distance measurements usually become meaningless as the dimensionality of a dataset increases, which has detrimental effects on clustering performance. In this paper, we propose a distance-based subspace clustering model, called nCluster, to find groups of objects that have similar values on subsets of dimensions. Instead of using a grid-based approach to partition the data space into non-overlapping rectangular cells, as in density-based subspace clustering algorithms, the nCluster model uses a more flexible method to partition the dimensions so that meaningful and significant clusters are preserved. We develop an efficient algorithm to mine only maximal nClusters. A set of experiments is conducted to show the efficiency of the proposed algorithm and the effectiveness of the new model in preserving significant clusters.
Keywords
data mining; database theory; pattern clustering; distance based subspace clustering; flexible dimension partitioning; nCluster; Clustering algorithms; Distance measurement; Kelvin; Merging; Partitioning algorithms
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location
Istanbul
Print_ISBN
1-4244-0802-4
Electronic_ISBN
1-4244-0803-2
Type
conf
DOI
10.1109/ICDE.2007.368985
Filename
4221775
Link To Document