• DocumentCode
    2334684
  • Title
    An agglomerative hierarchical clustering using partial maximum array and incremental similarity computation method
  • Author
    Jung, Sung Young ; Kim, Taek-Soo
  • Author_Institution
    Machine Intelligence Group, LG Electron. Inst. of Technol., Seoul, South Korea
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    265
  • Lastpage
    272
  • Abstract
    As the tractable amount of data grows in the computer science area, fast clustering algorithms are required, because traditional clustering algorithms are not feasible for very large and high-dimensional data. Many studies have been reported on the clustering of large databases, but most of them circumvent this problem by using an approximation method, resulting in the deterioration of accuracy. In this paper, we propose a new clustering algorithm by means of a partial maximum array, which can realize agglomerative hierarchical clustering with the same accuracy as the brute-force algorithm and has O(N^2) time complexity. We also present an incremental method of similarity computation which substitutes a scalar calculation for the time-consuming calculation of vector similarity. Experimental results show that clustering becomes significantly faster for large and high-dimensional data.
  • Keywords
    arrays; computational complexity; data mining; data structures; database theory; pattern clustering; very large databases; accuracy deterioration; agglomerative hierarchical clustering; approximation method; fast clustering algorithm; high-dimensional data; incremental similarity computation method; large data sets; large databases; partial maximum array; scalar calculation; time complexity; Approximation methods; Clustering algorithms; Clustering methods; Computational efficiency; Computer science; Databases; Iterative algorithms; Machine intelligence; Merging; Nearest neighbor searches;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
  • Conference_Location
    San Jose, CA
  • Print_ISBN
    0-7695-1119-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2001.989528
  • Filename
    989528