Title of article :
On voting-based consensus of cluster ensembles
Author/Authors :
Ayad, Hanan G. and Kamel, Mohamed S.
Issue Information :
Journal with serial number, year 2010
Pages :
11
From page :
1943
To page :
1953
Abstract :
Voting-based consensus clustering refers to a distinct class of consensus methods in which the cluster label mismatch problem is explicitly addressed. The voting problem is defined as the problem of finding the optimal relabeling of a given partition with respect to a reference partition. It is commonly formulated as a weighted bipartite matching problem. In this paper, we present a more general formulation of the voting problem as a regression problem with multiple-response and multiple-input variables. We show that a recently introduced cumulative voting scheme is a special case corresponding to a linear regression method. We use a randomized ensemble generation technique, where an overproduced number of clusters is randomly selected for each ensemble partition. We apply an information theoretic algorithm for extracting the consensus clustering from the aggregated ensemble representation and for estimating the number of clusters. We apply it in conjunction with bipartite matching and cumulative voting. We present empirical evidence showing substantial improvements in clustering accuracy, stability, and estimation of the true number of clusters based on cumulative voting. The improvements are achieved in comparison to consensus algorithms based on bipartite matching, which perform very poorly with the chosen ensemble generation technique, and also to other recent consensus algorithms.
Keywords :
Cluster ensembles , Voting-based consensus , Clustering
Journal title :
PATTERN RECOGNITION
Serial Year :
2010
Record number :
1733497