DocumentCode
1440375
Title
Fast Graph-Based Relaxed Clustering for Large Data Sets Using Minimal Enclosing Ball
Author
Qian, Pengjiang ; Chung, Fu-lai ; Wang, Shitong ; Deng, Zhaohong
Author_Institution
Sch. of Digital Media, Jiangnan Univ., Wuxi, China
Volume
42
Issue
3
fYear
2012
fDate
6/1/2012
Firstpage
672
Lastpage
687
Abstract
Graph-based relaxed clustering (GRC) is a spectral clustering algorithm that is straightforward and self-adaptive, but it is sensitive to the parameters of the adopted similarity measure and has a high time complexity, which severely limits its usefulness for large data sets. To overcome these shortcomings, certain constraints are introduced into GRC, yielding an enhanced version, constrained GRC (CGRC), that is more robust to the parameters of the adopted similarity measure. Based on CGRC, a novel algorithm called fast GRC (FGRC) is then developed in this paper using the core-set-based minimal enclosing ball approximation. A distinctive advantage of FGRC is that its asymptotic time complexity is linear in the data set size. FGRC also inherits the straightforwardness and self-adaptability of GRC, making it a fast and effective clustering algorithm for large data sets. The advantages of FGRC are validated on various benchmark and real data sets.
Keywords
approximation theory; computational complexity; graph theory; pattern clustering; asymptotic time complexity; constrained GRC; core-set-based minimal enclosing ball approximation; fast graph-based relaxed clustering; large data set clustering; similarity measure; spectral clustering algorithm; Approximation methods; Clustering algorithms; Complexity theory; Eigenvalues and eigenfunctions; Kernel; Support vector machines; Symmetric matrices; Clustering; large data sets; minimal enclosing ball (MEB); time complexity; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Databases, Factual; Information Storage and Retrieval; Models, Theoretical; Pattern Recognition, Automated
fLanguage
English
Journal_Title
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
Publisher
IEEE
ISSN
1083-4419
Type
jour
DOI
10.1109/TSMCB.2011.2172604
Filename
6145713