DocumentCode
2334468
Title
Clustering validity assessment: finding the optimal partitioning of a data set
Author
Halkidi, Maria ; Vazirgiannis, Michalis
fYear
2001
fDate
2001
Firstpage
187
Lastpage
194
Abstract
Clustering is a mostly unsupervised procedure and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. As a consequence, in most applications the resulting clustering scheme requires some sort of evaluation regarding its validity. In this paper we present a clustering validity procedure, which evaluates the results of clustering algorithms on data sets. We define a validity index, S_Dbw, based on well-defined clustering criteria enabling the selection of optimal input parameter values for a clustering algorithm that result in the best partitioning of a data set. We evaluate the reliability of our index both theoretically and experimentally, considering three representative clustering algorithms run on synthetic and real data sets. We also carried out an evaluation study to compare S_Dbw performance with other known validity indices. Our approach performed favorably in all cases, even those in which other indices failed to indicate the correct partitions in a data set.
Keywords
data mining; pattern clustering; SDbw validity index; clustering algorithms; clustering validity assessment; optimal partitioning data set; reliability; Clustering algorithms; Data visualization; Geometry; Humans; Informatics; Multidimensional systems; Partitioning algorithms; Radio access networks; Reliability theory; Visual perception;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
Conference_Location
San Jose, CA
Print_ISBN
0-7695-1119-8
Type
conf
DOI
10.1109/ICDM.2001.989517
Filename
989517