Title :
Optimization of the distributed K-means clustering algorithm based on set pair analysis
Author :
Song Ling;Qi Yunfeng
Author_Institution :
School of Computer & Electronic Information, Guangxi University, Nanning, China
Abstract :
The distributed K-means clustering algorithm for multidimensional data is widely used. However, the current distributed K-means clustering algorithm uses Euclidean distance to compare the similarity of multidimensional data, which makes its partitioning of the data set relatively rigid. To address this problem, we present a distributed K-means clustering algorithm (SPAB-DKMC) based on set pair analysis. Experiments on the Hadoop distributed platform show that SPAB-DKMC reduces the number of iterations and improves the efficiency of the distributed K-means clustering algorithm.
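The abstract does not give the SPAB-DKMC similarity formula, so the following is only a minimal sketch of how a set-pair-style connection degree could stand in for Euclidean distance in the K-means assignment step. It uses the general set pair analysis form mu = a + b*i + c*j with j = -1. The per-dimension thresholds `same_tol` and `contrary_tol`, the default `i = 0.0`, and the function names `connection_degree` and `assign` are all assumptions for illustration and are not taken from the paper.

```python
from typing import Sequence


def connection_degree(x: Sequence[float], y: Sequence[float],
                      same_tol: float = 0.1, contrary_tol: float = 0.5,
                      i: float = 0.0) -> float:
    """Set-pair connection degree mu = a + b*i + c*j, with j fixed at -1.

    Assumed rule (not from the paper): a dimension counts as "identical"
    when the values differ by at most same_tol, "contrary" when they differ
    by at least contrary_tol, and "discrepant" otherwise.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    if not same_tol < contrary_tol:
        raise ValueError("same_tol must be smaller than contrary_tol")
    n = len(x)
    diffs = [abs(p - q) for p, q in zip(x, y)]
    same = sum(1 for d in diffs if d <= same_tol)
    contrary = sum(1 for d in diffs if d >= contrary_tol)
    discrepant = n - same - contrary
    a, b, c = same / n, discrepant / n, contrary / n
    return a + b * i - c  # j = -1, so the contrary share is subtracted


def assign(point: Sequence[float], centers: Sequence[Sequence[float]]) -> int:
    """Return the index of the center with the highest connection degree."""
    return max(range(len(centers)),
               key=lambda k: connection_degree(point, centers[k]))
```

In a Hadoop-style job, a function like `assign` would plausibly run in the map phase, with the reduce phase recomputing the centers; that split is standard for distributed K-means but is likewise an assumption here, not a detail stated in the abstract.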
Keywords :
"Clustering algorithms","Algorithm design and analysis","Signal processing algorithms","Euclidean distance","Distributed databases","Classification algorithms","Convergence"
Conference_Title :
Image and Signal Processing (CISP), 2015 8th International Congress on
DOI :
10.1109/CISP.2015.7408139