• DocumentCode
    2971327
  • Title

    Learning the Threshold in Hierarchical Agglomerative Clustering

  • Author

    Daniels, Kristine ; Giraud-Carrier, Christophe

  • Author_Institution
    Dept. of Comput. Sci., Brigham Young Univ., Provo, UT
  • fYear
    2006
  • fDate
    Dec. 2006
  • Firstpage
    270
  • Lastpage
    278
  • Abstract
    Most partitional clustering algorithms require the number of desired clusters to be set a priori. Not only is this somewhat counter-intuitive, it is also difficult except in the simplest of situations. By contrast, hierarchical clustering may create partitions with varying numbers of clusters. The actual final partition depends on a threshold placed on the similarity measure used. Given a cluster quality metric, one can efficiently discover an appropriate threshold through a form of semi-supervised learning. This paper shows one such solution for complete-link hierarchical agglomerative clustering using the F-measure and a small subset of labeled examples. Empirical evaluation demonstrates promise.
  • Keywords
    learning (artificial intelligence); pattern clustering; hierarchical agglomerative clustering algorithm; semisupervised learning algorithm; Clustering algorithms; Computer science; Data mining; Data visualization; Euclidean distance; Iterative algorithms; Merging; Partitioning algorithms; Semisupervised learning; Taxonomy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications, 2006. ICMLA '06. 5th International Conference on
  • Conference_Location
    Orlando, FL
  • Print_ISBN
    0-7695-2735-3
  • Type

    conf

  • DOI
    10.1109/ICMLA.2006.33
  • Filename
    4041503
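
The idea summarized in the abstract (choosing the cut threshold of complete-link agglomerative clustering by scoring candidate partitions with an F-measure on a few labeled examples) can be illustrated with a minimal sketch. This is not the paper's procedure: the record does not give the algorithm, so the brute-force scan over candidate thresholds, the pairwise form of the F-measure, the Euclidean distance, and all names (`complete_link`, `pairwise_f`, `learn_threshold`, the toy `points` and `known` data) are assumptions made for illustration only.

```python
import math
from itertools import combinations


def complete_link(points, threshold):
    """Naive complete-link clustering: merge until the closest pair of
    clusters is farther apart than `threshold`. Returns one label per point."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # Complete link: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, x, y = min(
            (dist(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if d > threshold:
            break
        merged = clusters.pop(y)  # y > x, so index x stays valid
        clusters[x].extend(merged)

    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def pairwise_f(labels, known):
    """Pairwise F-measure over the labeled subset (assumed variant).
    `known` maps a point index to its true class."""
    tp = fp = fn = 0
    for i, j in combinations(sorted(known), 2):
        same_pred = labels[i] == labels[j]
        same_true = known[i] == known[j]
        tp += same_pred and same_true
        fp += same_pred and not same_true
        fn += same_true and not same_pred
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def learn_threshold(points, known, candidates):
    """Return the candidate threshold whose partition scores best on `known`."""
    return max(candidates,
               key=lambda t: pairwise_f(complete_link(points, t), known))


# Toy data (hypothetical): two well-separated groups, four labeled points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
known = {0: "a", 1: "a", 3: "b", 4: "b"}
best = learn_threshold(points, known, candidates=[0.5, 2.0, 20.0])
print(best, complete_link(points, best))
```

On this toy data the smallest candidate leaves every point in its own cluster and the largest merges everything, so the middle candidate is the one that groups the labeled pairs correctly. Re-clustering from scratch for each candidate is wasteful; a practical version would build the dendrogram once and evaluate cuts along it.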