DocumentCode
3455538
Title
Finding the Optimal Number of Clusters from Artificial Datasets
Author
Päivinen, Niina ; Grönfors, Tapio
Author_Institution
Dept. of Comput. Sci., Kuopio Univ., Kuopio
fYear
2006
fDate
20-22 Aug. 2006
Firstpage
1
Lastpage
6
Abstract
This study deals with the problem of selecting the right number of clusters. Scale-free minimum spanning trees (SFMSTs) were constructed from artificial test datasets, and both the number of clusters, based on the distribution of the edge lengths, and the clustering itself were obtained from the tree structure. As a reference, the nearest neighbor and k-means clustering methods were used, with the number of clusters determined by the largest average silhouette width criterion. The SFMST clustering method proved able to find the optimal number of clusters in a dataset automatically, without any user-defined parameters.
Keywords
pattern clustering; statistical distributions; trees (mathematics); artificial dataset; edge length distribution; k-means clustering method; largest average silhouette width criterion; nearest neighbor clustering method; optimal cluster selection problem; probability distribution; scale-free minimum spanning tree; Bridges; Clustering methods; Computer science; Data analysis; Histograms; Joining processes; Nearest neighbor searches; Probability distribution; Testing; Tree graphs
fLanguage
English
Publisher
ieee
Conference_Titel
Computational Cybernetics, 2006. ICCC 2006. IEEE International Conference on
Conference_Location
Budapest
Print_ISBN
1-4244-0071-6
Electronic_ISBN
1-4244-0072-4
Type
conf
DOI
10.1109/ICCCYB.2006.305691
Filename
4097652