Title of article :
Entropy-based Consensus for Distributed Data Clustering
Author/Authors :
Akbarzadeh-T, M.R. (Department of Computer Engineering - Center of Excellence on Soft Computing and Intelligent Information Processing - Ferdowsi University of Mashhad - Mashhad, Iran) , Owhadi-Kareshk, M. (Department of Computer Engineering - Center of Excellence on Soft Computing and Intelligent Information Processing - Ferdowsi University of Mashhad - Mashhad, Iran)
Pages :
11
From page :
551
To page :
561
Abstract :
The growing scale of available data and increasingly strict concerns about its privacy are among the main challenges of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with consideration for the confidentiality of data: only the local cluster centers are negotiated in the consensus process, so no private data is transferred. Entropy is proposed as an internal measure of clustering validity at each machine, so that the cluster centers of local machines with higher expected clustering validity have more influence on the final consensus centers. We also employ the relative cost function of the local Fuzzy C-Means (FCM) as a measure of a machine's validity relative to the other machines, and the number of data points in each machine as a measure of its reliability. The utility of the proposed consensus strategy is examined on 18 datasets from the UCI repository in terms of clustering accuracy and speed-up against the centralized version of FCM. Several experiments confirm that the proposed approach yields higher speed-up and accuracy, while maintaining data security through its protected and distributed processing.
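The abstract does not give the EC3 formulas, so the following is only a minimal sketch of the general idea: each machine reports its cluster centers with a scalar weight, and the consensus centers are a weighted average. Everything specific here is an assumption made for illustration: the function names, the particular way entropy, FCM cost and data count are combined into a weight, the choice that lower membership entropy means higher validity, and the premise that cluster centers are already matched by index across machines.

```python
import math


def membership_entropy(memberships):
    """Mean Shannon entropy of fuzzy membership rows, scaled to [0, 1].

    memberships: list of rows, each row holding one point's membership
    degrees over the c clusters (each row sums to 1).
    """
    c = len(memberships[0])
    total = 0.0
    for row in memberships:
        total += -sum(u * math.log(u) for u in row if u > 0)
    return total / (len(memberships) * math.log(c))


def machine_weight(entropy, cost, n_points):
    """Hypothetical weight, NOT the paper's formula.

    Assumes a machine is more trustworthy when its memberships are crisper
    (low entropy), its per-point FCM cost is lower, and it holds more data.
    """
    return n_points * (1.0 - entropy) / (1.0 + cost / n_points)


def consensus_centers(local_centers, weights):
    """Weighted average of local cluster centers.

    local_centers[m][k] is the k-th center (a list of floats) from machine m.
    Assumes centers are already aligned by index k across machines.
    """
    total = sum(weights)
    n_clusters = len(local_centers[0])
    dim = len(local_centers[0][0])
    return [
        [
            sum(w * centers[k][d] for centers, w in zip(local_centers, weights)) / total
            for d in range(dim)
        ]
        for k in range(n_clusters)
    ]


if __name__ == "__main__":
    # Two machines, two clusters in 2-D; only centers and weights are shared.
    centers_a = [[0.0, 0.0], [10.0, 10.0]]
    centers_b = [[2.0, 2.0], [12.0, 12.0]]
    print(consensus_centers([centers_a, centers_b], [1.0, 3.0]))
```

In this sketch only centers and scalar weights leave a machine, which mirrors the confidentiality argument in the abstract; the raw data points and membership matrices stay local.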
Keywords :
Ensemble Learning Entropy , Fuzzy C-Means , Distributed Clustering , Consensus Clustering
Journal title :
Astroparticle Physics
Serial Year :
2019
Record number :
2453202