  • DocumentCode
    2208418
  • Title
    minCEntropy: A Novel Information Theoretic Approach for the Generation of Alternative Clusterings
  • Author
    Vinh, Nguyen Xuan; Epps, Julien
  • Author_Institution
    Sch. of Electr. Eng. & Telecommun., Univ. of New South Wales, Sydney, NSW, Australia
  • fYear
    2010
  • fDate
    13-17 Dec. 2010
  • Firstpage
    521
  • Lastpage
    530
  • Abstract
    Traditional clustering has focused on creating a single good clustering solution, while modern, high dimensional data can often be interpreted, and hence clustered, in different ways. Alternative clustering aims at creating multiple clustering solutions that are both of high quality and distinctive from each other. Methods for alternative clustering can be divided into objective-function-oriented and data-transformation-oriented approaches. This paper presents a novel information theoretic-based, objective-function-oriented approach to generate alternative clusterings, in either an unsupervised or semi-supervised manner. We employ the conditional entropy measure for quantifying both clustering quality and distinctiveness, resulting in an analytically consistent combined criterion. Our approach employs a computationally efficient nonparametric entropy estimator, which does not impose any assumption on the probability distributions. We propose a partitional clustering algorithm, named minCEntropy, to concurrently optimize both clustering quality and distinctiveness. minCEntropy requires setting only some rather intuitive parameters, and performs competitively with existing methods for alternative clustering.
  • Keywords
    entropy; nonparametric statistics; pattern clustering; statistical distributions; alternative clustering generation; data-transformation-oriented approach; information theoretic approach; minCEntropy; nonparametric entropy estimator; objective-function-oriented approach; partitional clustering algorithm; probability distributions; alternative clustering; clustering; information theoretic clustering; multi-objective optimization; transformation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2010 IEEE 10th International Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-9131-5
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2010.24
  • Filename
    5694006
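
The abstract names conditional entropy as the measure behind both clustering quality and distinctiveness. As a rough illustration of that quantity only, the sketch below computes the plain Shannon conditional entropy H(A | B) between two label vectors. It is not the paper's nonparametric estimator and not the minCEntropy algorithm; the function name `conditional_entropy` and the toy label vectors are assumptions made for this example.

```python
import math
from collections import Counter


def conditional_entropy(labels_a, labels_b):
    """Shannon conditional entropy H(A | B), in nats, of two label sequences.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("label sequences must not be empty")

    # Count how often each (a, b) label pair and each b label occurs.
    joint = Counter(zip(labels_a, labels_b))
    marginal_b = Counter(labels_b)

    # H(A | B) = - sum_{a,b} p(a, b) * log( p(a, b) / p(b) )
    h = 0.0
    for (_, b), n_ab in joint.items():
        h -= (n_ab / n) * math.log(n_ab / marginal_b[b])
    return h


# Toy example (assumed data): two clusterings of four objects.
same = conditional_entropy([0, 0, 1, 1], [0, 0, 1, 1])
crossed = conditional_entropy([0, 0, 1, 1], [0, 1, 0, 1])
```

In this toy setting, a value near zero means that knowing clustering B almost determines clustering A, while a larger value suggests the two clusterings carry different information about the data.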