DocumentCode :
1708139
Title :
Stability Yields a PTAS for k-Median and k-Means Clustering
Author :
Awasthi, Pranjal ; Blum, Avrim ; Sheffet, Or
Author_Institution :
Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear :
2010
Firstpage :
309
Lastpage :
318
Abstract :
We consider k-median clustering in finite metric spaces and k-means clustering in Euclidean spaces, in the setting where k is part of the input (not a constant). For the k-means problem, Ostrovsky et al. show that if the optimal (k - 1)-means clustering of the input is more expensive than the optimal k-means clustering by a factor of 1/ε², then one can achieve a (1 + f(ε))-approximation to the k-means optimal in time polynomial in n and k by using a variant of Lloyd's algorithm. In this work we substantially improve this approximation guarantee. We show that given only the condition that the (k - 1)-means optimal is more expensive than the k-means optimal by a factor 1 + α for some constant α > 0, we can obtain a PTAS. In particular, under this assumption, for any ε > 0 we achieve a (1 + ε)-approximation to the k-means optimal in time polynomial in n and k, and exponential in 1/ε and 1/α. We thus decouple the strength of the assumption from the quality of the approximation ratio. We also give a PTAS for the k-median problem in finite metrics under the analogous assumption. For k-means, we in addition give a randomized algorithm with improved running time of n^{O(1)} (k log n)^{poly(1/ε, 1/α)}. Our technique also obtains a PTAS under the assumption of Balcan et al. that all (1 + α)-approximations are δ-close to a desired target clustering, in the case that all target clusters have size greater than δn and α > 0 is constant. Note that the motivation of Balcan et al. is that for many clustering problems, the objective function is only a proxy for the true goal of getting close to the target. From this perspective, our improvement is that for k-means in Euclidean spaces we reduce the distance of the clustering found to the target from O(δ) to δ when all target clusters are large, and for k-median we improve the "largeness" condition needed to get exactly δ-close from O(δn) to δn. Our results are based on a new notion of clustering stability.
Keywords :
pattern clustering; stability; Euclidean spaces; Lloyd algorithm; PTAS; finite metric spaces; k-means clustering; k-median clustering; Approximation algorithms; Approximation methods; Clustering algorithms; Extraterrestrial measurements; Optimized production technology; Polynomials
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on
Conference_Location :
Las Vegas, NV
ISSN :
0272-5428
Print_ISBN :
978-1-4244-8525-3
Type :
conf
DOI :
10.1109/FOCS.2010.36
Filename :
5671196