• DocumentCode
    2600380
  • Title

    Segregation-based subspace clustering for huge dimensional data

  • Author

    Alsagabi, Majid I. ; Tewfik, Ahmed H.

  • Author_Institution
    Dept. of Electr. & Comput. Engineering, Univ. of Minnesota, Minneapolis, MN, USA
  • fYear
    2010
  • fDate
    10-12 Nov. 2010
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Clustering algorithms break down when the data points fall in huge-dimensional spaces. To tackle this problem, many subspace clustering methods have been proposed to build up a subspace where data points cluster efficiently. The bottom-up approach is widely used to select a set of candidate features and then to use a portion of this set to build up the hidden subspace step by step. The complexity depends exponentially or cubically on the number of selected features. In this paper, we present SEGCLU, a SEGregation-based subspace CLUstering method that significantly reduces the size of the candidate feature set and has cubic complexity. The algorithm was applied to noise-free DNA copy number data from two groups, autistic and typically developing children, to extract a potential biomarker for autism. 85% of the individuals were classified correctly in a 13-dimensional subspace.
  • Keywords
    DNA; bioinformatics; pattern clustering; 13-dimensional subspace; DNA copy numbers; SEGCLU; autism; autistic children; bottom-up approach; cubic complexity; hidden subspace; huge dimensional data; potential biomarker; segregation-based subspace clustering; typically developing children; Decision support systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Genomic Signal Processing and Statistics (GENSIPS), 2010 IEEE International Workshop on
  • Conference_Location
    Cold Spring Harbor, NY
  • ISSN
    2150-3001
  • Print_ISBN
    978-1-61284-791-7
  • Type

    conf

  • DOI
    10.1109/GENSIPS.2010.5719667
  • Filename
    5719667