• DocumentCode
    2734872
  • Title
    Polynomial time approximation schemes for geometric k-clustering
  • Author
    Ostrovsky, Rafail; Rabani, Yuval
  • Author_Institution
    Telcordia Technol., Morristown, NJ, USA
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    349
  • Lastpage
    358
  • Abstract
    We deal with the problem of clustering data points. Given n points in a larger set (for example, R^d) endowed with a distance function (for example, L_2 distance), we would like to partition the data set into k disjoint clusters, each with a “cluster center”, so as to minimize the sum over all data points of the distance between the point and the center of the cluster containing the point. The problem is provably NP-hard in some high dimensional geometric settings, even for k=2. We give polynomial time approximation schemes for this problem in several settings, including the binary cube {0, 1}^d with Hamming distance, and R^d either with L_1 distance, or with L_2 distance, or with the square of L_2 distance. In all these settings, the best previous results were constant factor approximation guarantees. We note that our problem is similar in flavor to the k-median problem (and the related facility location problem), which has been considered in graph-theoretic and fixed dimensional geometric settings, where it becomes hard when k is part of the input. In contrast, we study the problem when k is fixed, but the dimension is part of the input. Our algorithms are based on a dimension reduction construction for the Hamming cube, which may be of independent interest.
  • Keywords
    computational complexity; computational geometry; pattern clustering; Hamming distance; NP-hard problem; binary cube; data point clustering; data set partitioning; distance function; geometric k-clustering; high dimensional geometry; k-median problem; polynomial time approximation schemes; Cities and towns; Computational biology; Construction industry; Contracts; Euclidean distance; Operations research; Partitioning algorithms; Polynomials; Uniform resource locators
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Foundations of Computer Science, 2000. Proceedings. 41st Annual Symposium on
  • Conference_Location
    Redondo Beach, CA
  • ISSN
    0272-5428
  • Print_ISBN
    0-7695-0850-2
  • Type
    conf
  • DOI
    10.1109/SFCS.2000.892123
  • Filename
    892123
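
The objective stated in the abstract (the sum, over all points, of the distance from each point to the center of its cluster) can be illustrated for the Hamming cube setting. The following is a minimal sketch, not the approximation scheme from the paper: it only evaluates the cost of a partition that is already given. The function names (`hamming`, `majority_center`, `clustering_cost`) and the sample data are hypothetical, and the sketch assumes every cluster is non-empty and that each center is chosen as the coordinate-wise majority bit, which minimizes the sum of Hamming distances to the points of a single cluster.

```python
from typing import Sequence, Tuple

Point = Tuple[int, ...]  # a point of the binary cube {0, 1}^d


def hamming(a: Point, b: Point) -> int:
    """Number of coordinates in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def majority_center(cluster: Sequence[Point]) -> Point:
    """Coordinate-wise majority bit of a non-empty cluster (ties go to 0)."""
    n = len(cluster)
    return tuple(1 if 2 * sum(column) > n else 0 for column in zip(*cluster))


def clustering_cost(clusters: Sequence[Sequence[Point]]) -> int:
    """Sum over all points of the Hamming distance to their cluster's center."""
    total = 0
    for cluster in clusters:
        center = majority_center(cluster)
        total += sum(hamming(point, center) for point in cluster)
    return total


# Hypothetical example: five points in {0, 1}^3 split into k = 2 clusters.
clusters = [
    [(0, 0, 0), (0, 0, 1), (0, 1, 0)],
    [(1, 1, 1), (1, 1, 0)],
]
cost = clustering_cost(clusters)
print(cost)
```

Finding the partition that minimizes this cost is the hard part addressed by the paper; the sketch only makes the quantity being minimized concrete.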