DocumentCode
390903
Title
A parameterless method for efficiently discovering clusters of arbitrary shape in large datasets
Author
Foss, Andrew; Zaïane, Osmar R.
Author_Institution
Alberta Univ., Edmonton, Alta., Canada
fYear
2002
fDate
2002
Firstpage
179
Lastpage
186
Abstract
Clustering is the problem of grouping data based on similarity and consists of maximizing the intra-group similarity while minimizing the inter-group similarity. The problem of clustering data sets is also known as unsupervised classification, since no class labels are given. However, all existing clustering algorithms require some parameters to steer the clustering process, such as the famous k for the number of expected clusters, which constitutes a supervision of a sort. We present in this paper a new, efficient, fast and scalable clustering algorithm that clusters over a range of resolutions and finds a potential optimum clustering without requiring any parameter input. Our experiments show that our algorithm outperforms most existing clustering algorithms in quality and speed for large data sets.
Keywords
data mining; minimisation; pattern clustering; arbitrarily shaped cluster discovery; clustering; efficient clustering algorithm; fast clustering algorithm; inter-group similarity minimization; intra-group similarity maximization; large datasets; parameterless method; scalable clustering algorithm; unsupervised classification; Clustering algorithms; Clustering methods; Gravity; Multi-stage noise shaping; Noise shaping; Partitioning algorithms; Scalability; Shape
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN
0-7695-1754-4
Type
conf
DOI
10.1109/ICDM.2002.1183901
Filename
1183901