Title :
P3C: A Robust Projected Clustering Algorithm
Author :
Moise, Gabriela ; Sander, Jörg ; Ester, Martin
Author_Institution :
Dept. of Comput. Sci., Alberta Univ., Edmonton, AB
Abstract :
Projected clustering has emerged as a possible solution to the challenges associated with clustering in high-dimensional data. A projected cluster is a subset of points together with a subset of attributes, such that the cluster points project onto a small range of values in each of these attributes, and are uniformly distributed in the remaining attributes. Existing algorithms for projected clustering rely on parameters whose appropriate values are difficult for the user to set, or are unable to identify projected clusters with few relevant attributes. In this paper, we present a robust algorithm for projected clustering that can effectively discover projected clusters in the data while minimizing the number of parameters required as input. In contrast to all previous approaches, our algorithm can discover, under very general conditions, the true number of projected clusters. We show through an extensive experimental evaluation that our algorithm: (1) significantly outperforms existing algorithms for projected clustering in terms of accuracy; (2) is effective in detecting very low-dimensional projected clusters embedded in high-dimensional spaces; (3) is effective in detecting clusters with varying orientation in their relevant subspaces; (4) is scalable with respect to large data sets and a high number of dimensions.
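The definition of a projected cluster given in the abstract can be illustrated with a small synthetic example. The sketch below is not the P3C algorithm; it only generates points that satisfy the definition and measures the range of values they cover in each attribute. The function names, the unit data range [0, 1], and the interval width of 0.1 are assumptions made for illustration and do not come from the paper.

```python
import random


def make_projected_cluster(n_points, n_dims, relevant, width=0.1, seed=0):
    """Generate points in [0, 1]^n_dims forming one projected cluster.

    `relevant` (hypothetical parameter) maps an attribute index to the lower
    bound of a small interval of length `width`. Every other attribute is
    drawn uniformly from [0, 1].
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        point = []
        for d in range(n_dims):
            if d in relevant:
                low = relevant[d]
                # Relevant attribute: values fall in a small range.
                point.append(rng.uniform(low, low + width))
            else:
                # Irrelevant attribute: values spread over the full range.
                point.append(rng.uniform(0.0, 1.0))
        points.append(point)
    return points


def attribute_ranges(points):
    """Return max - min of the projected values for each attribute."""
    return [max(column) - min(column) for column in zip(*points)]


# One cluster of 500 points in 6 dimensions; attributes 1 and 4 are relevant.
cluster = make_projected_cluster(500, 6, relevant={1: 0.2, 4: 0.7})
ranges = attribute_ranges(cluster)
for d, r in enumerate(ranges):
    print(f"attribute {d}: range {r:.3f}")
```

In this toy data, the ranges for attributes 1 and 4 stay at or below the chosen width, while the remaining attributes cover nearly the whole unit interval. A projected clustering method has to recover both the point subset and the relevant attribute subset from data where neither is known in advance.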
Keywords :
pattern clustering; very large databases; extensive experimental evaluation; high dimensional data; large data sets; robust algorithm; robust projected clustering algorithm; Clustering algorithms; Data models; Databases; Nearest neighbor searches; Principal component analysis; Robustness;
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2701-7
DOI :
10.1109/ICDM.2006.123