• DocumentCode
    3585930
  • Title
    A parallel sampling-PSO-multi-core-K-means algorithm using MapReduce
  • Author
    Bousbaci, Abdelhak ; Kamel, Nadjet
  • Author_Institution
    Comput. Sci. Dept., USTHB, Algiers, Algeria
  • fYear
    2014
  • Firstpage
    129
  • Lastpage
    134
  • Abstract
    Clustering partitions data into groups such that items in the same group are similar. Many clustering algorithms have been proposed in the literature. K-means is the most widely used because it is simple to implement and efficient. Many clustering algorithms build on K-means, aiming to improve execution time, clustering quality, or both. Clustering quality can be improved by optimally selecting the initial centroids, for example with meta-heuristics. Execution time can be improved through parallelism. In this paper, we propose a parallel hybrid K-means that uses Google's MapReduce framework for parallelism and the PSO meta-heuristic to choose the initial centroids. The algorithm is used to cluster multi-dimensional data sets. The results show that using a network of machines to process the data improves both execution time and clustering quality.
  • Keywords
    data handling; multiprocessing systems; parallel algorithms; particle swarm optimisation; pattern clustering; Google MapReduce framework; PSO metaheuristics; clustering algorithm; clustering quality; multidimensional data set; optimal selection; parallel hybrid k-means; parallel sampling-PSO-multicore-k-means algorithm; partitioning data; Algorithm design and analysis; Clustering algorithms; Heuristic algorithms; Instruction sets; Message systems; Parallel processing; Partitioning algorithms; K-means; MapReduce; PSO; Sampling; Shared memory;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hybrid Intelligent Systems (HIS), 2014 14th International Conference on
  • Print_ISBN
    978-1-4799-7632-4
  • Type
    conf
  • DOI
    10.1109/HIS.2014.7086185
  • Filename
    7086185
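
The abstract describes K-means parallelised with MapReduce. As a rough illustration only, one K-means iteration can be written as a map phase (assign each point to its nearest centroid) followed by a reduce phase (average the points of each group). This is a minimal in-memory sketch, not the authors' implementation: the function names `map_assign`, `reduce_mean`, and `kmeans_iteration` are hypothetical, no real MapReduce cluster is involved, and the sampling and PSO-based choice of initial centroids mentioned in the abstract are not shown (the initial centroids are simply passed in).

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_assign(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_mean(points):
    """Reduce step: average all points that share one centroid index."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(data, centroids):
    """One K-means pass expressed as a map phase then a reduce phase.

    Assumption: everything runs in one process; in a MapReduce setting the
    map calls would run in parallel over data splits.
    """
    groups = defaultdict(list)
    for point in data:  # map phase
        key, value = map_assign(point, centroids)
        groups[key].append(value)
    # reduce phase; a centroid with no assigned points is kept unchanged
    return [
        reduce_mean(groups[i]) if groups[i] else centroids[i]
        for i in range(len(centroids))
    ]
```

Calling `kmeans_iteration` repeatedly, feeding each result back in as the next `centroids` argument until the centroids stop moving, gives the usual K-means loop.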