• DocumentCode
    1440375
  • Title
    Fast Graph-Based Relaxed Clustering for Large Data Sets Using Minimal Enclosing Ball
  • Author
    Qian, Pengjiang ; Chung, Fu-lai ; Wang, Shitong ; Deng, Zhaohong
  • Author_Institution
    Sch. of Digital Media, Jiangnan Univ., Wuxi, China
  • Volume
    42
  • Issue
    3
  • fYear
    2012
  • fDate
    6/1/2012 12:00:00 AM
  • Firstpage
    672
  • Lastpage
    687
  • Abstract
    Although graph-based relaxed clustering (GRC) is one of the spectral clustering algorithms with straightforwardness and self-adaptability, it is sensitive to the parameters of the adopted similarity measure and also has high time complexity which severely weakens its usefulness for large data sets. In order to overcome these shortcomings, after introducing certain constraints for GRC, an enhanced version of GRC [constrained GRC (CGRC)] is proposed to increase the robustness of GRC to the parameters of the adopted similarity measure, and accordingly, a novel algorithm called fast GRC (FGRC) based on CGRC is developed in this paper by using the core-set-based minimal enclosing ball approximation. A distinctive advantage of FGRC is that its asymptotic time complexity is linear with the data set size. At the same time, FGRC also inherits the straightforwardness and self-adaptability from GRC, making the proposed FGRC a fast and effective clustering algorithm for large data sets. The advantages of FGRC are validated by various benchmarking and real data sets.
  • Keywords
    approximation theory; computational complexity; graph theory; pattern clustering; asymptotic time complexity; constrained GRC; core-set-based minimal enclosing ball approximation; fast graph-based relaxed clustering; large data set clustering; similarity measure; spectral clustering algorithm; Approximation methods; Clustering algorithms; Complexity theory; Eigenvalues and eigenfunctions; Kernel; Support vector machines; Symmetric matrices; Clustering; large data sets; minimal enclosing ball (MEB); time complexity; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Databases, Factual; Information Storage and Retrieval; Models, Theoretical; Pattern Recognition, Automated;
  • fLanguage
    English
  • Journal_Title
    Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4419
  • Type
    jour
  • DOI
    10.1109/TSMCB.2011.2172604
  • Filename
    6145713