• DocumentCode
    2774875
  • Title

    Enhancing the K-means Clustering Algorithm by Using a O(n logn) Heuristic Method for Finding Better Initial Centroids

  • Author

    Nazeer, K. A. Abdul ; Kumar, S. D. Madhu ; Sebastian, M. P.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Nat. Inst. of Technol. Calicut, Calicut, India
  • fYear
    2011
  • fDate
    19-20 Feb. 2011
  • Firstpage
    261
  • Lastpage
    264
  • Abstract
    With the advent of modern techniques for scientific data collection, large quantities of data are accumulating in various databases. Systematic data analysis methods are necessary to extract useful information from rapidly growing data banks. Cluster analysis is one of the major data mining methods, and the k-means clustering algorithm is widely used in many practical applications. However, the original k-means algorithm is computationally expensive, and the quality of the resulting clusters depends substantially on the choice of initial centroids. Several methods have been proposed in the literature for improving the performance of the k-means algorithm. This paper proposes an improvement on the classic k-means algorithm to produce more accurate clusters. The proposed algorithm comprises an O(n log n) heuristic method, based on sorting and partitioning the input data, for finding the initial centroids in accordance with the data distribution. Experimental results show that the proposed algorithm produces better clusters in less computation time.
  • Keywords
    data analysis; data mining; pattern clustering; O(n logn) heuristic method; cluster analysis; data analysis methods; data banks; data distribution; data mining methods; data partitioning; data sorting; information extraction; initial centroid; k-means clustering algorithm enhancement; Accuracy; Algorithm design and analysis; Clustering algorithms; Data mining; Heuristic algorithms; Machine learning algorithms; Partitioning algorithms; Clustering; Data Mining; Enhanced k-means Algorithm; Improved Initial Centroids; Sorting and Partitioning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Emerging Applications of Information Technology (EAIT), 2011 Second International Conference on
  • Conference_Location
    Kolkata
  • Print_ISBN
    978-1-4244-9683-9
  • Type

    conf

  • DOI
    10.1109/EAIT.2011.57
  • Filename
    5734940
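
The abstract above describes the initialization heuristic only as "sorting and partitioning the input data", without giving the details. The following is a minimal sketch of one way such a heuristic might look, not the authors' published procedure. The function name `initial_centroids`, the choice of sort key (the coordinate with the widest range), and the use of each partition's mean as its centroid are all assumptions made for illustration. The sort dominates the cost, which is where an O(n log n) bound would come from.

```python
from statistics import fmean


def initial_centroids(points, k):
    """Hypothetical sketch: pick k starting centroids by sorting and partitioning.

    Assumptions (not taken from the paper): points are equal-length tuples of
    numbers, the sort key is the coordinate with the widest range, and each
    partition is summarized by its mean.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    dims = len(points[0])

    # Assumed sort key: the coordinate along which the data spreads the most.
    axis = max(
        range(dims),
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )

    # The O(n log n) step: order the points along the chosen coordinate.
    ordered = sorted(points, key=lambda p: p[axis])

    # Split the ordered points into k contiguous, nearly equal partitions
    # and use the mean of each partition as an initial centroid.
    n = len(ordered)
    centroids = []
    for i in range(k):
        chunk = ordered[i * n // k:(i + 1) * n // k]
        centroids.append(tuple(fmean(p[d] for p in chunk) for d in range(dims)))
    return centroids
```

The returned centroids would then seed an ordinary k-means loop in place of randomly chosen starting points, so that the initial guesses follow the data distribution.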