DocumentCode :
390903
Title :
A parameterless method for efficiently discovering clusters of arbitrary shape in large datasets
Author :
Foss, Andrew ; Zaïane, Osmar R.
Author_Institution :
Alberta Univ., Edmonton, Alta., Canada
fYear :
2002
fDate :
2002
Firstpage :
179
Lastpage :
186
Abstract :
Clustering is the problem of grouping data based on similarity and consists of maximizing the intra-group similarity while minimizing the inter-group similarity. The problem of clustering data sets is also known as unsupervised classification, since no class labels are given. However, all existing clustering algorithms require some parameters to steer the clustering process, such as the famous k for the number of expected clusters, which constitutes a supervision of a sort. We present in this paper a new, efficient, fast and scalable clustering algorithm that clusters over a range of resolutions and finds a potential optimum clustering without requiring any parameter input. Our experiments show that our algorithm outperforms most existing clustering algorithms in quality and speed for large data sets.
Keywords :
data mining; minimisation; pattern clustering; arbitrarily shaped cluster discovery; clustering; efficient clustering algorithm; fast clustering algorithm; inter-group similarity minimization; intra-group similarity maximization; large datasets; parameterless method; scalable clustering algorithm; unsupervised classification; Clustering algorithms; Clustering methods; Gravity; Multi-stage noise shaping; Noise shaping; Partitioning algorithms; Scalability; Shape;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN :
0-7695-1754-4
Type :
conf
DOI :
10.1109/ICDM.2002.1183901
Filename :
1183901