DocumentCode
2971327
Title
Learning the Threshold in Hierarchical Agglomerative Clustering
Author
Daniels, Kristine ; Giraud-Carrier, Christophe
Author_Institution
Dept. of Comput. Sci., Brigham Young Univ., Provo, UT
fYear
2006
fDate
Dec. 2006
Firstpage
270
Lastpage
278
Abstract
Most partitional clustering algorithms require the number of desired clusters to be set a priori. Not only is this somewhat counter-intuitive, it is also difficult except in the simplest of situations. By contrast, hierarchical clustering may create partitions with varying numbers of clusters. The actual final partition depends on a threshold placed on the similarity measure used. Given a cluster quality metric, one can efficiently discover an appropriate threshold through a form of semi-supervised learning. This paper shows one such solution for complete-link hierarchical agglomerative clustering using the F-measure and a small subset of labeled examples. Empirical evaluation demonstrates promise.
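The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes SciPy's complete-link `linkage`, Euclidean distance, a pairwise F-measure (the abstract does not say which F-measure variant the paper uses), and an exhaustive scan over the dendrogram's merge heights as candidate thresholds. The names `pairwise_f_measure`, `learn_threshold`, `labeled_idx`, and `labeled_y` are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def pairwise_f_measure(true_labels, cluster_ids):
    """Pairwise F-measure (an assumed variant): a pair of points counts as
    a true positive when it shares both a class label and a cluster."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        if same_cluster and same_class:
            tp += 1
        elif same_cluster:
            fp += 1
        elif same_class:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def learn_threshold(X, labeled_idx, labeled_y):
    """Build a complete-link dendrogram over all points, then pick the
    distance threshold whose flat clustering scores the highest F-measure
    on the small labeled subset. Ties keep the smallest threshold."""
    Z = linkage(X, method="complete")   # Euclidean distance by default
    candidates = np.unique(Z[:, 2])     # every merge height in the tree
    best_t, best_f = None, -1.0
    for t in candidates:
        clusters = fcluster(Z, t=t, criterion="distance")
        f = pairwise_f_measure(labeled_y, clusters[labeled_idx])
        if f > best_f:
            best_t, best_f = float(t), f
    final = fcluster(Z, t=best_t, criterion="distance")
    return best_t, best_f, final
```

Here `X` is an array of feature vectors, `labeled_idx` lists the row indices whose class labels are known, and `labeled_y` holds those labels in the same order. Scanning only the merge heights is sufficient in this sketch because the flat partition changes only at those values.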
Keywords
learning (artificial intelligence); pattern clustering; hierarchical agglomerative clustering algorithm; semisupervised learning algorithm; Clustering algorithms; Computer science; Data mining; Data visualization; Euclidean distance; Iterative algorithms; Merging; Partitioning algorithms; Semisupervised learning; Taxonomy
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications, 2006. ICMLA '06. 5th International Conference on
Conference_Location
Orlando, FL
Print_ISBN
0-7695-2735-3
Type
conf
DOI
10.1109/ICMLA.2006.33
Filename
4041503