• DocumentCode
    1496416
  • Title
    Clustering Uncertain Data Using Voronoi Diagrams and R-Tree Index
  • Author
    Kao, Ben ; Lee, Sau Dan ; Lee, Foris K F ; Cheung, David Wai-Lok ; Ho, Wai-Shing
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
  • Volume
    22
  • Issue
    9
  • fYear
    2010
  • Firstpage
    1219
  • Lastpage
    1233
  • Abstract
    We study the problem of clustering uncertain objects whose locations are described by probability density functions (pdfs). We show that the UK-means algorithm, which generalizes the k-means algorithm to handle uncertain objects, is very inefficient. The inefficiency comes from the fact that UK-means computes expected distances (EDs) between objects and cluster representatives. For arbitrary pdfs, expected distances are computed by numerical integrations, which are costly operations. We propose pruning techniques that are based on Voronoi diagrams to reduce the number of expected distance calculations. These techniques are analytically proven to be more effective than the basic bounding-box-based technique previously known in the literature. We then introduce an R-tree index to organize the uncertain objects so as to reduce pruning overheads. We conduct experiments to evaluate the effectiveness of our novel techniques. We show that our techniques are additive and, when used in combination, significantly outperform previously known methods.
  • Keywords
    computational geometry; pattern clustering; R-tree index; Voronoi diagrams; clustering; expected distances; probability density functions; uncertain data; uncertainty; indexing methods; object hierarchies
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2010.82
  • Filename
    5467074