• DocumentCode
    2488795
  • Title
    Cluster validation using a probabilistic attributed graph
  • Author
    Fred, Ana L. N.; Jain, Anil K.
  • Author_Institution
    Instituto Superior Técnico, Lisbon
  • fYear
    2008
  • fDate
    8-11 Dec. 2008
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    We propose a new cluster validity index. A data partition is described by a set of disjoint sub-graphs, each corresponding to the minimum spanning tree of a cluster, taking as edge weight the dissimilarity between linked objects. Based on the assumption that each cluster has a characteristic parametric distribution of dissimilarity increments, graph probabilities are estimated. The validity index is defined as the minimum description length for both estimated model parameters and data partition, according to this probabilistic model. This new index can be used to evaluate various partitions of a given data set obtained by: (i) a single clustering algorithm, (ii) different clustering algorithms, or (iii) cluster ensemble methods. Experimental evaluation of the proposed index on synthetic and real data taken from the UCI repository confirms the usefulness of the method in selecting good clustering solutions.
  • Keywords
    graph theory; pattern clustering; probability; trees (mathematics); characteristic parametric distribution; cluster spanning tree; cluster validation; cluster validity index; data partition; disjoint sub-graphs; dissimilarity increments; graph probability; minimum description length; probabilistic attributed graph; single clustering algorithm; Algorithm design and analysis; Clustering algorithms; Euclidean distance; Exponential distribution; Nearest neighbor searches; Parameter estimation; Partitioning algorithms; Statistics; Telecommunications; Tree graphs;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Pattern Recognition, 2008. ICPR 2008. 19th International Conference on
  • Conference_Location
    Tampa, FL
  • ISSN
    1051-4651
  • Print_ISBN
    978-1-4244-2174-9
  • Electronic_ISBN
    1051-4651
  • Type
    conf
  • DOI
    10.1109/ICPR.2008.4761787
  • Filename
    4761787
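
The pipeline outlined in the abstract (one minimum spanning tree per cluster, a parametric model of dissimilarity increments, and a description-length score for the partition) can be illustrated with a rough sketch. This is not the authors' formulation, and several choices here are assumptions made only for illustration: Euclidean distance as the dissimilarity, an exponential model for the increments (suggested by the keyword list, not confirmed by the abstract), an "increment" taken as the absolute weight difference between two MST edges sharing a vertex, and a BIC-style penalty of 0.5·log(n) per cluster parameter. The function names `mst_edges`, `dissimilarity_increments`, and `description_length` are hypothetical.

```python
import math
from itertools import combinations


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, weight) edges."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def dissimilarity_increments(edges):
    """Assumed definition: |w1 - w2| for every pair of MST edges sharing a vertex."""
    incident = {}
    for i, j, w in edges:
        incident.setdefault(i, []).append(w)
        incident.setdefault(j, []).append(w)
    return [abs(a - b) for ws in incident.values() for a, b in combinations(ws, 2)]


def description_length(clusters, eps=1e-9):
    """Illustrative score: exponential negative log-likelihood of the increments
    of each cluster plus an assumed 0.5*log(n) cost for its rate parameter."""
    total = 0.0
    for pts in clusters:
        incs = dissimilarity_increments(mst_edges(pts))
        if incs:
            rate = 1.0 / (sum(incs) / len(incs) + eps)  # maximum-likelihood rate
            total -= sum(math.log(rate) - rate * x for x in incs)
        total += 0.5 * math.log(max(len(pts), 2))
    return total


# Two candidate partitions of the same six points; lower score is preferred.
good = [[(0, 0), (1, 0), (2.5, 0)], [(10, 0), (11, 0), (12.5, 0)]]
bad = [[(0, 0), (1, 0), (10, 0)], [(2.5, 0), (11, 0), (12.5, 0)]]
print(description_length(good), description_length(bad))
```

Under these assumptions, a partition whose clusters have small, homogeneous increments receives a shorter description than one whose spanning trees must bridge distant groups, which is the kind of comparison the abstract describes for choosing among candidate clusterings.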