DocumentCode :
2600380
Title :
Segregation-based subspace clustering for huge dimensional data
Author :
Alsagabi, Majid I. ; Tewfik, Ahmed H.
Author_Institution :
Dept. of Electr. & Comput. Engineering, Univ. of Minnesota, Minneapolis, MN, USA
fYear :
2010
fDate :
10-12 Nov. 2010
Firstpage :
1
Lastpage :
4
Abstract :
Clustering algorithms break down when the data points fall in huge-dimensional spaces. To tackle this problem, many subspace clustering methods were proposed to build up a subspace where data points cluster efficiently. The bottom-up approach is used widely to select a set of candidate features, and then to use a portion of this set to build up the hidden subspace step by step. The complexity depends exponentially or cubically on the number of the selected features. In this paper, we present SEGCLU, a SEGregation-based subspace CLUstering method which significantly reduces the size of the candidate feature set and has a cubic complexity. This algorithm was applied to noise-free data of DNA copy numbers of two groups of autistic and typically developing children to extract a potential bio-marker for autism. 85% of the individuals were classified correctly in a 13-dimensional subspace.
Keywords :
DNA; bioinformatics; pattern clustering; 13-dimensional subspace; DNA copy numbers; SEGCLU; autism; autistic children; bottom-up approach; cubic complexity; hidden subspace; huge dimensional data; potential biomarker; segregation-based subspace clustering; typically developing children; Decision support systems;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Genomic Signal Processing and Statistics (GENSIPS), 2010 IEEE International Workshop on
Conference_Location :
Cold Spring Harbor, NY
ISSN :
2150-3001
Print_ISBN :
978-1-61284-791-7
Type :
conf
DOI :
10.1109/GENSIPS.2010.5719667
Filename :
5719667