DocumentCode :
123444
Title :
K-medoids clustering based on MapReduce and optimal search of medoids
Author :
Ying-ting Zhu ; Fu-zhang Wang ; Xing-hua Shan ; Xiao-yan Lv
Author_Institution :
Railway Technol. Res. Coll., China Acad. of Railway Sci., Beijing, China
fYear :
2014
fDate :
22-24 Aug. 2014
Firstpage :
573
Lastpage :
577
Abstract :
The traditional k-medoids algorithm is robust to noise and outliers in the data, but its computational cost makes it suitable only for small and medium-sized data sets. MapReduce is a programming model for processing massive data and is well suited to parallel computation on big data. This paper therefore proposes an improved algorithm, based on MapReduce and an optimal search of medoids, for clustering big data. First, using basic properties of triangular geometry, the paper reduces the number of distance calculations among data elements, which speeds up the search for medoids and lowers the computational complexity of k-medoids. Second, following the working principle of MapReduce, the Map function calculates the distance between each data element and the medoids and assigns data elements to their clusters; the Reduce function checks the results from the Map function, searches for new medoids using the optimal search strategy, and returns the new results to the Map function for the next MapReduce round. Experimental results show that the proposed algorithm is efficient and effective.
Keywords :
Big Data; parallel programming; pattern clustering; MapReduce programming model; data elements; k-medoids algorithm; k-medoids clustering; mass data processing; medoid optimal search; parallel computing; reduce function; triangular geometry; Clustering algorithms; Computational modeling; Computers; MapReduce; cluster analysis; data mining; k-medoids; parallel algorithm
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Science & Education (ICCSE), 2014 9th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4799-2949-8
Type :
conf
DOI :
10.1109/ICCSE.2014.6926527
Filename :
6926527
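Illustrative sketch :
The abstract describes a Map step that assigns each data element to its nearest medoid, a Reduce step that searches each cluster for a new medoid, and a triangle-based shortcut that avoids some distance calculations. The Python sketch below is one possible reading of that loop, not the authors' implementation. Assumptions: the MapReduce rounds are simulated in a single process; the function names `map_assign`, `reduce_medoid`, and `kmedoids` are hypothetical; the pruning rule is the standard triangle-inequality bound, which may differ from the paper's; and an exhaustive per-cluster search stands in for the paper's optimal search strategy, which the abstract does not specify.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_assign(point, medoids, medoid_dists):
    """Map step (sketch): return (index of nearest medoid, point)."""
    best, best_d = 0, dist(point, medoids[0])
    for j in range(1, len(medoids)):
        # Triangle inequality: d(p, m_j) >= d(m_best, m_j) - d(p, m_best).
        # So if d(m_best, m_j) >= 2 * d(p, m_best), m_j cannot be closer
        # than the current best and its distance need not be computed.
        if medoid_dists[best][j] >= 2 * best_d:
            continue
        d = dist(point, medoids[j])
        if d < best_d:
            best, best_d = j, d
    return best, point


def reduce_medoid(members):
    """Reduce step (sketch): pick the member with the smallest total
    distance to the other members. Exhaustive search is an assumption
    standing in for the paper's optimal search strategy."""
    return min(members, key=lambda c: sum(dist(c, p) for p in members))


def kmedoids(points, medoids, max_iter=20):
    """Repeat simulated Map/Reduce rounds until the medoids stop changing."""
    medoids = list(medoids)
    clusters = {}
    for _ in range(max_iter):
        # Medoid-to-medoid distances are computed once per round.
        medoid_dists = [[dist(a, b) for b in medoids] for a in medoids]
        clusters = defaultdict(list)
        for p in points:
            key, value = map_assign(p, medoids, medoid_dists)
            clusters[key].append(value)
        # An empty cluster keeps its previous medoid.
        new_medoids = [
            reduce_medoid(clusters[j]) if j in clusters else medoids[j]
            for j in range(len(medoids))
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, dict(clusters)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_medoids, final_clusters = kmedoids(points, [(0, 1), (10, 11)])
```

In a real MapReduce deployment the `for p in points` loop would be distributed across mappers and each `reduce_medoid` call would run in a reducer keyed by cluster index; the pruning test only saves work when medoids are far apart relative to the point's distance to its current best medoid.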