Title :
Approximation algorithm for correlation clustering
Author :
Mitra, Pinaki ; Samal, Mamata
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Guwahati, Guwahati, India
Abstract :
The correlation clustering problem has been introduced recently as a model for clustering data when a binary relationship between data points is known. Correlation clustering is a graph-theoretic based clustering. More precisely, in a graph G = (V,E) vertices of the graph denote the number of items, where each edge of the graph labeled as either + or - depending upon the similarity or dissimilarity of its end points. An optimal clustering is measured in terms of (i) Maximizing the number of similarity inside the cluster plus the number of dissimilarity between the clusters (MAXAGREE). Or (ii) Minimizing the dissimilarity inside the cluster plus the number of similarity between the clusters(MINDISAGREE). These two optimization problems are NP-complete for k ges 2. We study the problem of finding clustering in terms of maximizing the number of similarity inside the cluster plus the number of dissimilarity between the clusters (MAXAGREE). Giotis and Guruswami obtained a polynomial time factor 0.87856 approximation algorithm for MAXAGREE on general graphs, by using a random hyperplane rounding of SDP relaxation of MAXAGREE. In this paper we present a polynomial time approximation algorithm with the improved approximation ratio 3/25 + 5radic(5) = 0.884458 for MAXAGREE, by using the linear combination of vector rounding and the Gegenbauer polynomials rounding. This result is also applicable for weighted 2-cluster editing problem, since SDP formulation of weighted 2-cluster editing problem is same as MAXAGREE.
Keywords :
computational complexity; graph theory; pattern clustering; polynomial approximation; Gegenbauer polynomials rounding; MAXAGREE; MINDISAGREE; NP-complete problem; correlation clustering problem; data clustering; graph-theoretic based clustering; optimization problem; polynomial time factor approximation algorithm; random hyperplane rounding; weighted 2-cluster editing problem; Approximation algorithms; Clustering algorithms; Computational biology; Computer science; Data engineering; Design optimization; Polynomials; Symmetric matrices; Time factors; Vectors;
Conference_Titel :
Networked Digital Technologies, 2009. NDT '09. First International Conference on
Conference_Location :
Ostrava
Print_ISBN :
978-1-4244-4614-8
Electronic_ISBN :
978-1-4244-4615-5
DOI :
10.1109/NDT.2009.5272169