• DocumentCode
    1301077
  • Title
    Multiobjective Genetic Algorithm-Based Fuzzy Clustering of Categorical Attributes
  • Author
    Mukhopadhyay, Anirban ; Maulik, Ujjwal ; Bandyopadhyay, Sanghamitra
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Kalyani, Kalyani, India
  • Volume
    13
  • Issue
    5
  • fYear
    2009
  • Firstpage
    991
  • Lastpage
    1005
  • Abstract
    Recently, the problem of clustering categorical data, where no natural ordering among the elements of a categorical attribute domain can be found, has been gaining significant attention from researchers. With the growing demand for categorical data clustering, a few clustering algorithms with focus on categorical data have recently been developed. However, most of these methods attempt to optimize a single measure of the clustering goodness. Often, such a single measure may not be appropriate for different kinds of datasets. Thus, consideration of multiple, often conflicting, objectives appears to be natural for this problem. Although we have previously addressed the problem of multiobjective fuzzy clustering for continuous data, these algorithms cannot be applied to categorical data where the cluster means are not defined. Motivated by this, in this paper a multiobjective genetic algorithm-based approach for fuzzy clustering of categorical data is proposed that encodes the cluster modes and simultaneously optimizes fuzzy compactness and fuzzy separation of the clusters. Moreover, a novel method for obtaining the final clustering solution from the set of resultant Pareto-optimal solutions is proposed. This is based on majority voting among Pareto front solutions followed by k-nn classification. The performance of the proposed fuzzy categorical data-clustering techniques has been compared with that of some other widely used algorithms, both quantitatively and qualitatively. For this purpose, various synthetic and real-life categorical datasets have been considered. Also, a statistical significance test has been conducted to establish the significant superiority of the proposed multiobjective approach.
  • Keywords
    Pareto optimisation; fuzzy set theory; genetic algorithms; pattern clustering; Pareto-optimal solutions; categorical attributes; categorical data; data-clustering techniques; fuzzy compactness; fuzzy separation; k-nn classification; multiobjective fuzzy clustering; multiobjective genetic algorithm; fuzzy clustering; Pareto optimality
  • fLanguage
    English
  • Journal_Title
    Evolutionary Computation, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1089-778X
  • Type
    jour
  • DOI
    10.1109/TEVC.2009.2012163
  • Filename
    5208225
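
The abstract refers to a set of Pareto-optimal solutions over two objectives, fuzzy compactness and fuzzy separation. As a rough illustration of what Pareto optimality means in that setting, the sketch below filters a list of candidate solutions down to the non-dominated ones. It is not the authors' algorithm: the names `dominates` and `pareto_front`, the tuple layout, and the sample values are hypothetical, and it assumes that compactness is minimized while separation is maximized.

```python
from typing import List, Tuple

# Hypothetical representation: each candidate clustering is scored as
# (compactness, separation). Assumption: lower compactness is better,
# higher separation is better.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """Return True if a is no worse than b in both objectives
    and strictly better in at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep only the scores that no other score dominates."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


# Made-up scores, for illustration only.
candidates = [(1.0, 5.0), (2.0, 6.0), (2.5, 5.5), (3.0, 4.0)]
front = pareto_front(candidates)
print(front)
```

Here `(2.5, 5.5)` is dominated by `(2.0, 6.0)` and `(3.0, 4.0)` is dominated by `(1.0, 5.0)`, so only the first two candidates remain. The pairwise check is quadratic in the number of candidates, which is adequate for a small illustrative population.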