• DocumentCode
    130938
  • Title
    Parallel K-Medoids clustering algorithm based on Hadoop
  • Author
    Yaobin Jiang ; Jiongmin Zhang
  • Author_Institution
    Dept. of Comput. Sci. & Technol., East China Normal Univ., Shanghai, China
  • fYear
    2014
  • fDate
    27-29 June 2014
  • Firstpage
    649
  • Lastpage
    652
  • Abstract
    The K-Medoids clustering algorithm overcomes the difficulty that the K-Means algorithm has in processing outlier samples, but it cannot process big data because of its time complexity [1]. MapReduce is a parallel programming model for processing big data and has been implemented in Hadoop. To overcome this limitation, a parallel K-Medoids algorithm based on Hadoop, HK-Medoids, is proposed. Each submitted job consists of many iterative MapReduce procedures: in the map phase, each sample is assigned to the cluster whose center is most similar to the sample; in the combine phase, an intermediate center is calculated for each cluster; and in the reduce phase, the new center is calculated. The iteration stops when the new center is similar to the old one. The experimental results show that the HK-Medoids algorithm produces good clustering results and achieves linear speedup on big data.
  • Keywords
    Big Data; computational complexity; iterative methods; parallel programming; pattern clustering; HK-medoids; Hadoop; big data processing; iterative MapReduce procedures; k-means algorithm; map phase; outlier sample processing; parallel k-medoids clustering algorithm; parallel programming model; time complexity; Algorithm design and analysis; Clustering algorithms; Computational modeling; Educational institutions; Indexes; Partitioning algorithms; Programming; Big-Data; Clustering Analysis; Hadoop; K-Medoids; MapReduce;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on
  • Conference_Location
    Beijing
  • ISSN
    2327-0586
  • Print_ISBN
    978-1-4799-3278-8
  • Type
    conf
  • DOI
    10.1109/ICSESS.2014.6933652
  • Filename
    6933652
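
The map, combine, and reduce phases described in the abstract can be illustrated with a small single-process sketch. This is not the authors' Hadoop implementation: the abstract does not say how similarity is measured or how the reduce phase derives the new center, so the sketch assumes Euclidean distance and assumes the reducer picks the medoid of the intermediate centers it receives. All function names (`dist`, `medoid`, `map_phase`, `combine_phase`, `reduce_phase`, `iterate_medoids`) and the `tol` and `max_iter` parameters are hypothetical.

```python
from collections import defaultdict


def dist(a, b):
    # Euclidean distance (assumed; the abstract only says "most similar").
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def medoid(points):
    # The member of `points` with the smallest total distance to the others.
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


def map_phase(split, medoids):
    # Emit (cluster index, sample) for the nearest current medoid.
    for p in split:
        yield min(range(len(medoids)), key=lambda i: dist(p, medoids[i])), p


def combine_phase(pairs):
    # Per input split: compute one intermediate center for each cluster.
    groups = defaultdict(list)
    for k, p in pairs:
        groups[k].append(p)
    return {k: medoid(ps) for k, ps in groups.items()}


def reduce_phase(candidates):
    # Assumption: the new center is the medoid of the intermediate centers.
    return medoid(candidates)


def iterate_medoids(splits, medoids, tol=1e-9, max_iter=100):
    # Repeat map -> combine -> reduce until the centers stop moving.
    for _ in range(max_iter):
        per_cluster = defaultdict(list)
        for split in splits:
            combined = combine_phase(map_phase(split, medoids))
            for k, center in combined.items():
                per_cluster[k].append(center)
        new = [
            reduce_phase(per_cluster[k]) if k in per_cluster else medoids[k]
            for k in range(len(medoids))
        ]
        if all(dist(a, b) <= tol for a, b in zip(new, medoids)):
            return new
        medoids = new
    return medoids
```

Each element of `splits` stands in for one input split handled by a mapper. In this sketch a cluster that receives no samples keeps its previous center, which is a choice made here for simplicity and is not stated in the abstract.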