• Title of article

    GAKREM: A novel hybrid clustering algorithm

  • Author/Authors

    Cao D. Nguyen, Krzysztof J. Cios

  • Issue Information
    Journal with serial issue number, year 2008
  • Pages
    23
  • From page
    4205
  • To page
    4227
  • Abstract
    We introduce a novel clustering algorithm named GAKREM (Genetic Algorithm K-means Logarithmic Regression Expectation Maximization) that combines the best characteristics of the K-means and EM algorithms but avoids their weaknesses such as the need to specify a priori the number of clusters, termination in local optima, and lengthy computations. To achieve these goals, genetic algorithms for estimating parameters and initializing starting points for the EM are used first. Second, the log-likelihood of each configuration of parameters and the number of clusters resulting from the EM is used as the fitness value for each chromosome in the population. The novelty of GAKREM is that in each evolving generation it efficiently approximates the log-likelihood for each chromosome using logarithmic regression instead of running the conventional EM algorithm until its convergence. Another novelty is the use of K-means to initially assign data points to clusters. The algorithm is evaluated by comparing its performance with the conventional EM algorithm, the K-means algorithm, and the likelihood cross-validation technique on several datasets.
  • Keywords
    Clustering, EM, k-means, Logarithmic regression, Likelihood cross-validation, GAKREM, Genetic algorithms
  • Journal title
    Information Sciences
  • Serial Year
    2008
  • Record number

    1213449
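
The abstract states that GAKREM approximates each chromosome's log-likelihood with logarithmic regression instead of running EM to convergence. The record gives no further detail, so the following is only a minimal sketch of that general idea, not the authors' implementation. It assumes a model of the form y = a + b * ln(t), where t is the EM iteration index and y is the log-likelihood observed at that iteration. The function names, the number of partial iterations, and the synthetic values are all hypothetical.

```python
import math


def fit_log_regression(iterations, log_likelihoods):
    """Ordinary least-squares fit of y = a + b * ln(t).

    Assumed model form; the abstract does not specify the regression details.
    Requires at least two distinct positive iteration indices.
    """
    xs = [math.log(t) for t in iterations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(log_likelihoods) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, log_likelihoods))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_log_likelihood(a, b, iteration):
    """Extrapolate the fitted curve to a later EM iteration."""
    return a + b * math.log(iteration)


# Synthetic values generated from an exact logarithmic curve, purely for
# illustration; they stand in for log-likelihoods from a few early EM steps.
iters = [1, 2, 3, 4, 5]
partial_ll = [-500.0 + 40.0 * math.log(t) for t in iters]

a, b = fit_log_regression(iters, partial_ll)
# Hypothetical fitness value: the log-likelihood extrapolated to iteration 50.
estimate = predict_log_likelihood(a, b, 50)
```

In a genetic-algorithm setting such as the one the abstract describes, an extrapolated value like `estimate` could serve as a cheap stand-in for the fitness of a chromosome, at the cost of only a few EM iterations per candidate.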