Title of article
GAKREM: A novel hybrid clustering algorithm
Author/Authors
Cao D. Nguyen, Krzysztof J. Cios
Issue Information
Journal with serial issue number, year 2008
Pages
23
From page
4205
To page
4227
Abstract
We introduce a novel clustering algorithm named GAKREM (Genetic Algorithm K-means Logarithmic Regression Expectation Maximization) that combines the best characteristics of the K-means and EM algorithms while avoiding their weaknesses: the need to specify the number of clusters a priori, termination in local optima, and lengthy computations. To achieve these goals, genetic algorithms are first used to estimate parameters and to initialize starting points for the EM algorithm. Second, the log-likelihood of each configuration of parameters and number of clusters resulting from the EM algorithm is used as the fitness value of the corresponding chromosome in the population. The novelty of GAKREM is that, in each evolving generation, it efficiently approximates the log-likelihood for each chromosome using logarithmic regression instead of running the conventional EM algorithm to convergence. Another novelty is the use of K-means to initially assign data points to clusters. The algorithm is evaluated by comparing its performance with the conventional EM algorithm, the K-means algorithm, and the likelihood cross-validation technique on several datasets.
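The abstract does not give the form of the logarithmic regression, so the following is only a minimal sketch of the general idea under a stated assumption: the log-likelihood after EM iteration t is modeled as a + b*ln(t), fitted by least squares to the first few iterations, and then extrapolated to a later iteration instead of running EM that far. The function names, the target iteration, and the sample values are hypothetical and are not taken from the paper.

```python
import math


def fit_log_regression(values):
    """Least-squares fit of y = a + b*ln(t) for t = 1..len(values).

    ASSUMPTION: this functional form is illustrative; the abstract only
    says that "logarithmic regression" is used.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = [math.log(t) for t in range(1, n + 1)]
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def extrapolate_log_likelihood(partial_log_likelihoods, target_iteration):
    """Predict the log-likelihood at a later iteration from a short EM run."""
    a, b = fit_log_regression(partial_log_likelihoods)
    return a + b * math.log(target_iteration)


# Hypothetical log-likelihoods from the first five EM iterations.
partial = [-520.0, -498.5, -486.0, -477.1, -470.2]
estimate = extrapolate_log_likelihood(partial, target_iteration=50)
```

In a genetic-algorithm setting, a value such as `estimate` could serve as the fitness of one chromosome, so that each candidate needs only a handful of EM iterations per generation.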
Keywords
Clustering, EM, K-means, Logarithmic regression, Likelihood cross-validation, GAKREM, Genetic algorithms
Journal title
Information Sciences
Serial Year
2008
Record number
1213449