• DocumentCode
    2334468
  • Title
    Clustering validity assessment: finding the optimal partitioning of a data set
  • Author
    Halkidi, Maria ; Vazirgiannis, Michalis
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    187
  • Lastpage
    194
  • Abstract
    Clustering is a mostly unsupervised procedure and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. As a consequence, in most applications the resulting clustering scheme requires some sort of evaluation regarding its validity. In this paper we present a clustering validity procedure, which evaluates the results of clustering algorithms on data sets. We define a validity index, S_Dbw, based on well-defined clustering criteria enabling the selection of optimal input parameter values for a clustering algorithm that result in the best partitioning of a data set. We evaluate the reliability of our index both theoretically and experimentally, considering three representative clustering algorithms run on synthetic and real data sets. We also carried out an evaluation study to compare S_Dbw performance with other known validity indices. Our approach performed favorably in all cases, even those in which other indices failed to indicate the correct partitions in a data set.
  • Keywords
    data mining; pattern clustering; S_Dbw validity index; clustering algorithms; clustering validity assessment; optimal partitioning data set; reliability; Data visualization; Geometry; Humans; Informatics; Multidimensional systems; Partitioning algorithms; Radio access networks; Reliability theory; Visual perception
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
  • Conference_Location
    San Jose, CA
  • Print_ISBN
    0-7695-1119-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2001.989517
  • Filename
    989517