DocumentCode :
3756817
Title :
MDL-based Hierarchical Clustering
Author :
Zdravko Markov
Author_Institution :
Comput. Sci. Dept., Central Connecticut State Univ., New Britain, CT, USA
fYear :
2015
Firstpage :
471
Lastpage :
474
Abstract :
This paper presents a new hierarchical clustering algorithm based on the use of the Minimum Description Length (MDL) principle. The clusters are created by recursively splitting the data using the values of an attribute (similarly to decision tree learning), so that each cluster contains the instances that have the same value for this attribute. Attributes are chosen to minimize the MDL evaluation measure of the clustering they create. The algorithm's computational complexity is linear in the number of data instances and quadratic in the total number of different attribute-values in the data, and can be substantially reduced by an efficient implementation using bit-level parallelism. We empirically evaluate the algorithm on 20 datasets from the UCI ML repository and show that it compares favorably to k-means and EM.
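The recursive splitting described in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's MDL evaluation measure or its stopping rule, so the code below substitutes a simple stand-in cost (the number of bits needed to entropy-code all attribute values inside each cluster, with no model-cost term) and a hypothetical `min_size` cutoff. The function names `coded_length`, `split_by`, `best_attribute`, and `cluster` are illustrative only and are not taken from the paper.

```python
from collections import Counter, defaultdict
from math import log2


def coded_length(rows, attrs):
    """Stand-in cost (an assumption, not the paper's measure): bits needed
    to entropy-code every value of every attribute in `rows`."""
    n = len(rows)
    bits = 0.0
    for a in attrs:
        for count in Counter(r[a] for r in rows).values():
            bits -= count * log2(count / n)
    return bits


def split_by(rows, attr):
    """Group rows so each group shares one value of `attr`."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[attr]].append(r)
    return dict(groups)


def best_attribute(rows, attrs):
    """Pick the attribute whose split gives the lowest total stand-in cost."""
    def cost(a):
        return sum(coded_length(g, attrs) for g in split_by(rows, a).values())
    return min(attrs, key=cost)


def cluster(rows, attrs, min_size=2):
    """Recursively split on the best attribute. Leaves are lists of rows;
    inner nodes are dicts keyed by (attribute, value).
    `min_size` is a hypothetical stopping rule for this sketch."""
    if len(rows) < min_size or not attrs:
        return rows
    a = best_attribute(rows, attrs)
    groups = split_by(rows, a)
    if len(groups) < 2:
        return rows
    rest = [x for x in attrs if x != a]
    return {(a, v): cluster(g, rest, min_size) for v, g in groups.items()}
```

Calling `cluster(rows, attrs)` on a list of dictionaries that map attribute names to nominal values returns a nested dictionary describing the cluster hierarchy. Because the stand-in cost omits any model-cost term, it tends to keep splitting until `min_size` is reached; a full MDL measure would also charge for the complexity of the clustering itself.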
Keywords :
"Clustering algorithms","Classification algorithms","Decision trees","Algorithm design and analysis","Encoding","Computational complexity","Entropy"
Publisher :
IEEE
Conference_Title :
Machine Learning and Applications (ICMLA), 2015 IEEE 14th International Conference on
Type :
conf
DOI :
10.1109/ICMLA.2015.95
Filename :
7424360