• DocumentCode
    123367
  • Title

    A K-means clustering with optimized initial center based on Hadoop platform

  • Author

    Kunhui Lin ; Xiang Li ; Zhongnan Zhang ; Jiahong Chen

  • Author_Institution
    Software Sch., Xiamen Univ., Xiamen, China
  • fYear
    2014
  • fDate
    22-24 Aug. 2014
  • Firstpage
    263
  • Lastpage
    266
  • Abstract
    With the explosive growth of data, traditional clustering algorithms running on standalone servers can no longer meet demand. To address this, a growing number of researchers implement traditional clustering algorithms, K-means in particular, on cloud computing platforms. Yet few of them pay attention to the structure of K-means clustering itself; most optimize the model of the cloud computing platform to raise the computing speed of K-means. The instability caused by random initial centers therefore remains. In this paper, we propose a K-means clustering algorithm with optimized initial centers based on data dimensional density. This method avoids the deficiency of random initial centers and improves the stability of K-means clustering. The experimental results show that the approach performs well and improves the accuracy of K-means clustering on the test set.
  • Keywords
    cloud computing; data handling; pattern clustering; Hadoop platform; K-means clustering; MapReduce programming model; cloud computing platforms; data dimensional density; optimized initial center; random initial centers; Clustering algorithms; Computers; Educational institutions; Density; Initial center; MapReduce
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science & Education (ICCSE), 2014 9th International Conference on
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    978-1-4799-2949-8
  • Type

    conf

  • DOI
    10.1109/ICCSE.2014.6926466
  • Filename
    6926466
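
The abstract contrasts random initial centers with centers chosen from data density, but this record does not say how "data dimensional density" is computed. The sketch below is therefore only an illustration of the general idea, not the authors' method: it assumes density means the number of points within a hypothetical `radius`, picks seeds greedily from the densest points while keeping them more than `2 * radius` apart, and then runs ordinary K-means from those seeds. The names `density_seeds`, `kmeans`, and `radius` are illustrative, and the code runs on a single machine with no Hadoop or MapReduce involved.

```python
from math import dist


def density_seeds(points, k, radius):
    """Pick k initial centers from dense, mutually separated regions.

    Illustrative heuristic only; `radius` is an assumed tuning parameter.
    """
    # Density of a point = number of points within `radius` (itself included).
    density = [sum(dist(p, q) <= radius for q in points) for p in points]
    # Visit points from densest to sparsest; the sort is stable,
    # so ties keep their input order and the result is deterministic.
    order = sorted(range(len(points)), key=lambda i: -density[i])
    seeds = []
    for i in order:
        # Accept a point only if it is far from every seed chosen so far.
        if all(dist(points[i], s) > 2 * radius for s in seeds):
            seeds.append(points[i])
            if len(seeds) == k:
                break
    if len(seeds) < k:
        raise ValueError("radius too large to find k separated seeds")
    return seeds


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Two well-separated groups of four points each (made-up toy data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
seeds = density_seeds(points, k=2, radius=1.5)
centers = kmeans(points, seeds)
```

Because the seeding involves no random draw, repeated runs on the same data start from the same centers, which is the kind of stability the abstract refers to. Whether this matches the paper's density definition is unknown from this record.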