• DocumentCode
    390903
  • Title

    A parameterless method for efficiently discovering clusters of arbitrary shape in large datasets

  • Author

    Foss, Andrew ; Zaïane, Osmar R.

  • Author_Institution
    Alberta Univ., Edmonton, Alta., Canada
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    179
  • Lastpage
    186
  • Abstract
    Clustering is the problem of grouping data based on similarity and consists of maximizing the intra-group similarity while minimizing the inter-group similarity. The problem of clustering data sets is also known as unsupervised classification, since no class labels are given. However, all existing clustering algorithms require some parameters to steer the clustering process, such as the famous k for the number of expected clusters, which constitutes a supervision of a sort. We present in this paper a new, efficient, fast and scalable clustering algorithm that clusters over a range of resolutions and finds a potential optimum clustering without requiring any parameter input. Our experiments show that our algorithm outperforms most existing clustering algorithms in quality and speed for large data sets.
  • Keywords
    data mining; minimisation; pattern clustering; arbitrarily shaped cluster discovery; clustering; efficient clustering algorithm; fast clustering algorithm; inter-group similarity minimization; intra-group similarity maximization; large datasets; parameterless method; scalable clustering algorithm; unsupervised classification; Clustering algorithms; Clustering methods; Gravity; Multi-stage noise shaping; Noise shaping; Partitioning algorithms; Scalability; Shape
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
  • Print_ISBN
    0-7695-1754-4
  • Type

    conf

  • DOI
    10.1109/ICDM.2002.1183901
  • Filename
    1183901