DocumentCode
2208418
Title
minCEntropy: A Novel Information Theoretic Approach for the Generation of Alternative Clusterings
Author
Vinh, Nguyen Xuan ; Epps, Julien
Author_Institution
Sch. of Electr. Eng. & Telecommun., Univ. of New South Wales, Sydney, NSW, Australia
fYear
2010
fDate
13-17 Dec. 2010
Firstpage
521
Lastpage
530
Abstract
Traditional clustering has focused on creating a single good clustering solution, while modern, high-dimensional data can often be interpreted, and hence clustered, in different ways. Alternative clustering aims at creating multiple clustering solutions that are both of high quality and distinct from one another. Methods for alternative clustering can be divided into objective-function-oriented and data-transformation-oriented approaches. This paper presents a novel information-theoretic, objective-function-oriented approach to generating alternative clusterings, in either an unsupervised or a semi-supervised manner. We employ the conditional entropy measure to quantify both clustering quality and distinctiveness, resulting in an analytically consistent combined criterion. Our approach employs a computationally efficient nonparametric entropy estimator, which does not impose any assumption on the probability distributions. We propose a partitional clustering algorithm, named minCEntropy, to concurrently optimize both clustering quality and distinctiveness. minCEntropy requires setting only a few intuitive parameters and performs competitively with existing methods for alternative clustering.
Keywords
entropy; nonparametric statistics; pattern clustering; statistical distributions; alternative clustering generation; data-transformation-oriented approach; information theoretic approach; minCEntropy; nonparametric entropy estimator; objective-function-oriented approach; partitional clustering algorithm; probability distributions; alternative clustering; clustering; information theoretic clustering; multi-objective optimization; transformation
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2010 IEEE 10th International Conference on
Conference_Location
Sydney, NSW
ISSN
1550-4786
Print_ISBN
978-1-4244-9131-5
Electronic_ISBN
1550-4786
Type
conf
DOI
10.1109/ICDM.2010.24
Filename
5694006