• DocumentCode
    2711406
  • Title

    A New Cluster Validity Index Based on Fuzzy Granulation-degranulation Criteria

  • Author

    Saha, Sriparna ; Bandyopadhyay, Sanghamitra

  • Author_Institution
    Indian Stat. Inst., Kolkata
  • fYear
    2007
  • fDate
    18-21 Dec. 2007
  • Firstpage
    353
  • Lastpage
    358
  • Abstract
    Identifying the correct number of clusters and the corresponding partitioning are two important considerations in clustering. In this paper, a new fuzzy quantization-dequantization criterion is used to propose a cluster validity index named the fuzzy vector quantization based validity index (FVQ index). This index measures how well the formed cluster centers represent the data set. Most existing validity indices try to optimize the total variance of the partitioning, which is a measure of the compactness of the clusters formed. Here, a new kind of error function, which reflects how well the formed cluster centers represent the whole data set, is used as the measure of goodness of the obtained partitioning. This error function decreases monotonically as the number of clusters increases. The minimum separation between two cluster centers is used to normalize the error function. The well-known genetic algorithm based K-means clustering algorithm (GAK-means) is used as the underlying partitioning technique. The number of clusters is varied from 2 to √N, where N is the total number of data points in the data set, and the values of the proposed validity index are recorded. The minimum value of the FVQ index over these √N − 1 partitions identifies the appropriate partitioning and the corresponding number of clusters. Results on five artificially generated and three real-life data sets show the effectiveness of the proposed validity index. For comparison, the number of clusters identified by a well-known cluster validity index, the XB-index, is also reported for the same eight data sets.
  • Keywords
    data mining; fuzzy set theory; genetic algorithms; pattern classification; pattern clustering; vector quantisation; K-means clustering algorithm; XB-index; cluster validity index; error function; fuzzy granulation-degranulation criterion; fuzzy vector quantization; genetic algorithm; Books; Clustering algorithms; Clustering methods; Decoding; Genetic algorithms; Machine intelligence; Partitioning algorithms; Scattering; Vector quantization; Virtual manufacturing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Computing and Communications, 2007. ADCOM 2007. International Conference on
  • Conference_Location
    Guwahati, Assam
  • Print_ISBN
    0-7695-3059-1
  • Type

    conf

  • DOI
    10.1109/ADCOM.2007.19
  • Filename
    4425996
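
The selection procedure summarized in the abstract (vary the number of clusters K from 2 to √N, score each partition with a validity index, keep the minimum) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the record does not give the FVQ formula, so the Xie-Beni (XB) index mentioned in the abstract is used as the score; plain k-means with deterministic farthest-point seeding stands in for GAK-means; fuzzy memberships are derived from the centers with the usual fuzzy c-means formula; and all function names are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def candidate_cluster_counts(n_points):
    """K = 2 .. floor(sqrt(N)), i.e. floor(sqrt(N)) - 1 candidate partitions."""
    return list(range(2, math.isqrt(n_points) + 1))


def xb_index(points, centers, m=2.0):
    """Xie-Beni index: fuzzy compactness / (N * min squared center separation).

    Assumption: memberships come from the fuzzy c-means membership formula.
    """
    min_sep = min(sq_dist(a, b)
                  for i, a in enumerate(centers) for b in centers[i + 1:])
    if min_sep == 0.0:
        return math.inf  # duplicate centers: treat as the worst possible score
    compactness = 0.0
    for p in points:
        d2 = [sq_dist(p, c) for c in centers]
        if min(d2) == 0.0:
            continue  # point lies on a center: membership 1 there, zero error
        for d in d2:
            u = 1.0 / sum((d / other) ** (1.0 / (m - 1.0)) for other in d2)
            compactness += (u ** m) * d
    return compactness / (len(points) * min_sep)


def kmeans(points, k, iters=50):
    """Plain Lloyd k-means (stand-in for GAK-means), farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each center as its group's mean; keep it if the group is empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers


def select_partition(points, cluster_fn, index_fn):
    """Return (score, k, centers) for the K that minimizes the validity index."""
    best = None
    for k in candidate_cluster_counts(len(points)):
        centers = cluster_fn(points, k)
        score = index_fn(points, centers)
        if best is None or score < best[0]:
            best = (score, k, centers)
    return best
```

For example, `select_partition(points, kmeans, xb_index)` returns the lowest-scoring partition among the candidates; a different index or clustering routine can be passed in with the same signatures.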