• DocumentCode
    4365
  • Title
    Centroid-Based Actionable 3D Subspace Clustering
  • Author
    Sim, Kihong ; Ghim-Eng Yap ; Hardoon, D.R. ; Gopalkrishnan, Vivekanand ; Gao Cong ; Lukman, Suryani
  • Author_Institution
    Inst. for Infocomm Res., A*STAR, Singapore, Singapore
  • Volume
    25
  • Issue
    6
  • fYear
    2013
  • fDate
    June 2013
  • Firstpage
    1213
  • Lastpage
    1226
  • Abstract
    Actionable 3D subspace clustering from real-world continuous-valued 3D (i.e., object-attribute-context) data promises tangible benefits such as discovery of biologically significant protein residues and profitable stocks, but existing algorithms are inadequate for this clustering problem: most of them are not actionable (that is, they cannot suggest profitable or beneficial actions to users), do not allow incorporation of domain knowledge, and are parameter sensitive, i.e., a wrong threshold setting reduces cluster quality. Moreover, the 3D structure of the data complicates the clustering problem. We propose a centroid-based actionable 3D subspace clustering framework, named CATSeeker, which allows incorporation of domain knowledge and achieves parameter insensitivity and excellent performance through a unique combination of singular value decomposition, numerical optimization, and 3D frequent itemset mining. Experimental results on synthetic, protein structural, and financial data show that CATSeeker significantly outperforms all the competing methods in terms of efficiency, parameter insensitivity, and cluster usefulness.
  • Keywords
    data mining; optimisation; pattern clustering; singular value decomposition; solid modelling; 3D frequent itemset mining; 3D structure; CATSeeker; biologically significant protein residue discovery; centroid-based actionable 3D subspace clustering framework; centroid-based actionable 3D subspace clustering problem; cluster quality; cluster usefulness; domain knowledge incorporation; numerical optimization; parameter insensitivity; profitable stocks; real-world continuous-valued 3D; tangible benefits; Cats; Clustering algorithms; Data mining; Proteins; Tensile stress; Three dimensional displays; 3D subspace clustering; financial data mining; protein structural and dynamics analysis; singular vector decomposition
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2012.37
  • Filename
    6152120