Title :
An improved Fuzzy C-Means algorithm based on MapReduce
Author :
Qing Yu;Zhimin Ding
Author_Institution :
Tianjin Key Laboratory of Intelligence Computing and Network Security, Tianjin University of Technology, Tianjin, China
Abstract :
To address the sensitivity of the Fuzzy C-Means (FCM) algorithm to its initial cluster centers, we use the Canopy algorithm to perform a fast, coarse clustering first. To avoid the blindness of the Canopy algorithm, we propose an improved Canopy-FCM algorithm based on a max-min principle. Because the FCM algorithm has high time complexity, we design and implement the improved Canopy-FCM algorithm on the MapReduce parallel computing framework. Experimental results show that the improved Canopy-FCM algorithm based on MapReduce achieves better clustering quality and running speed than the Canopy-FCM and FCM algorithms based on MapReduce, and that the improved Canopy-FCM algorithm on Hadoop achieves a better speed-up ratio than Canopy-FCM in standalone mode.
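The abstract does not spell out the max-min rule or the MapReduce job layout, so the following is only a minimal single-process sketch of the general idea, under stated assumptions: initial centers are chosen by a farthest-first (max-min distance) pass, and those centers then seed the standard FCM updates. The function names `max_min_seeds` and `fcm`, the choice of the first point as the starting seed, and the fuzzifier `m=2.0` are illustrative assumptions, not details taken from the paper.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_min_seeds(points, c):
    """Pick c initial centers by a max-min (farthest-first) rule.

    Assumption: the first point is used as the starting seed.
    """
    seeds = [points[0]]
    while len(seeds) < c:
        # Choose the point whose nearest existing seed is farthest away.
        nxt = max(points, key=lambda p: min(dist(p, s) for s in seeds))
        seeds.append(nxt)
    return seeds


def fcm(points, centers, m=2.0, iters=50, eps=1e-9):
    """Run standard FCM updates for a fixed number of iterations."""
    c, dims = len(centers), len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: depends only on one point and the centers,
        # so it is the part that could be computed independently per point.
        u = []
        for p in points:
            d = [max(dist(p, v), eps) for v in centers]
            u.append([
                1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for i in range(c)
            ])
        # Center update: weighted sums over all points, i.e. an aggregation.
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wj * p[k] for wj, p in zip(w, points)) / total
                for k in range(dims)
            ))
        centers = new_centers
    return centers, u


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = max_min_seeds(points, 2)
centers, memberships = fcm(points, seeds)
```

In this sketch the membership update is independent for each point and the center update is a pair of weighted sums, which is the kind of split a map phase and a reduce phase could plausibly take; whether the paper partitions the work this way is not stated in the abstract.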
Keywords :
"Clustering algorithms","Algorithm design and analysis","Data collection","Convergence","Classification algorithms","Blindness","Time complexity"
Conference_Title :
Biomedical Engineering and Informatics (BMEI), 2015 8th International Conference on
DOI :
10.1109/BMEI.2015.7401581