• DocumentCode
    2734715
  • Title
    Testing of clustering
  • Author
    Alon, Noga ; Dar, Seannie ; Parnas, Michal ; Ron, Dana
  • Author_Institution
    Dept. of Math., Tel Aviv Univ., Israel
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    240
  • Lastpage
    250
  • Abstract
    A set X of points in ℜ^d is (k,b)-clusterable if X can be partitioned into k subsets (clusters) so that the diameter (alternatively, the radius) of each cluster is at most b. We present algorithms that, by sampling from a set X, distinguish between the case that X is (k,b)-clusterable and the case that X is ε-far from being (k,b′)-clusterable, for any given 0<ε⩽1 and for b′⩾b. By ε-far from being (k,b′)-clusterable we mean that more than ε·|X| points must be removed from X so that it becomes (k,b′)-clusterable. We give algorithms for a variety of cost measures that use a sample of size independent of |X| and polynomial in k and 1/ε. Our algorithms can also be used to find approximately good clusterings, namely clusterings of all but an ε-fraction of the points in X that have optimal (or close to optimal) cost. The benefit of our algorithms is that they construct an implicit representation of such clusterings in time independent of |X|. That is, without actually having to partition all points in X, the implicit representation can be used to answer queries concerning the cluster to which any given point belongs.
  • Keywords
    computational complexity; pattern clustering; statistical analysis; clustering testing; cost measures; lower bounds; optimal cost; sampling; Clustering algorithms; Cost function; Educational institutions; Mathematics; Partitioning algorithms; Performance evaluation; Sampling methods; Size measurement; Testing; USA Councils
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Foundations of Computer Science, 2000. Proceedings. 41st Annual Symposium on
  • Conference_Location
    Redondo Beach, CA
  • ISSN
    0272-5428
  • Print_ISBN
    0-7695-0850-2
  • Type
    conf
  • DOI
    10.1109/SFCS.2000.892111
  • Filename
    892111
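
The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm and does not reproduce their sample-size bounds. It assumes points on a line (d = 1), the diameter cost, and b′ = b; the function names and the `sample_size` parameter are hypothetical. The tester draws a uniform sample and accepts exactly when the sample itself is (k,b)-clusterable, so a (k,b)-clusterable set is always accepted, while a set that is far from clusterable is rejected once the sample is large enough to expose a violation.

```python
import random


def is_clusterable_1d(points, k, b):
    """Return True if the 1-D points split into at most k clusters of diameter <= b.

    Scanning the sorted points and opening a new cluster only when the
    current point lies more than b from the cluster's leftmost point uses
    the fewest clusters possible on a line.
    """
    clusters = 0
    start = None  # leftmost point of the cluster currently being filled
    for p in sorted(points):
        if start is None or p - start > b:
            clusters += 1
            start = p
            if clusters > k:
                return False
    return True


def sample_test(points, k, b, sample_size, rng=None):
    """Accept iff a uniform sample (with replacement) is (k, b)-clusterable.

    sample_size is an assumed parameter here, not the bound from the paper.
    """
    rng = rng or random.Random()
    sample = [rng.choice(points) for _ in range(sample_size)]
    return is_clusterable_1d(sample, k, b)
```

For example, `sample_test([0, 1, 10, 11], k=2, b=1, sample_size=50)` accepts for every possible sample, because any subset of a (k,b)-clusterable set is itself (k,b)-clusterable.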