DocumentCode :
4365
Title :
Centroid-Based Actionable 3D Subspace Clustering
Author :
Sim, Kihong ; Ghim-Eng Yap ; Hardoon, D.R. ; Gopalkrishnan, Vivekanand ; Gao Cong ; Lukman, Suryani
Author_Institution :
Inst. for Infocomm Res., A*STAR, Singapore, Singapore
Volume :
25
Issue :
6
fYear :
2013
fDate :
June 2013
Firstpage :
1213
Lastpage :
1226
Abstract :
Actionable 3D subspace clustering from real-world continuous-valued 3D (i.e., object-attribute-context) data promises tangible benefits such as discovery of biologically significant protein residues and profitable stocks, but existing algorithms are inadequate for solving this clustering problem: most of them are not actionable (i.e., they cannot suggest profitable or beneficial actions to users), do not allow incorporation of domain knowledge, and are parameter sensitive, i.e., a wrong threshold setting reduces cluster quality. Moreover, the 3D structure of the data complicates this clustering problem. We propose a centroid-based actionable 3D subspace clustering framework, named CATSeeker, which allows incorporation of domain knowledge and achieves parameter insensitivity and excellent performance through a unique combination of singular value decomposition, numerical optimization, and 3D frequent itemset mining. Experimental results on synthetic, protein structural, and financial data show that CATSeeker significantly outperforms all the competing methods in terms of efficiency, parameter insensitivity, and cluster usefulness.
Keywords :
data mining; optimisation; pattern clustering; singular value decomposition; solid modelling; 3D frequent itemset mining; 3D structure; CATSeeker; biologically significant protein residue discovery; centroid-based actionable 3D subspace clustering framework; centroid-based actionable 3D subspace clustering problem; cluster quality; cluster usefulness; domain knowledge incorporation; numerical optimization; parameter insensitivity; profitable stocks; real-world continuous-valued 3D; singular value decomposition; tangible benefits; Cats; Clustering algorithms; Data mining; Proteins; Tensile stress; Three dimensional displays; 3D subspace clustering; financial data mining; numerical optimization; protein structural and dynamics analysis; singular vector decomposition;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2012.37
Filename :
6152120