• Title of article

    A comparative study of efficient initialization methods for the k-means clustering algorithm

  • Author/Authors

    Celebi, M. Emre and Kingravi, Hassan A. and Vela, Patricio A.

  • Issue Information
    Journal with serial issue number, year 2013
  • Pages
    11
  • From page
    200
  • To page
    210
  • Abstract
    K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear time complexity initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods.
  • Keywords
    Partitional clustering, Sum of squared error criterion, K-Means, Cluster center initialization
  • Journal title
    Expert Systems with Applications
  • Serial Year
    2013
  • Record number

    2352895