Regional Information Center for Science and Technology (RICeST) - Segregation-based subspace clustering for huge dimensional data

DocumentCode :

2600380

Title :

Segregation-based subspace clustering for huge dimensional data

Author :

Alsagabi, Majid I. ; Tewfik, Ahmed H.

Author_Institution :

Dept. of Electr. & Comput. Engineering, Univ. of Minnesota, Minneapolis, MN, USA

fYear :

2010

fDate :

10-12 Nov. 2010

Firstpage :

Lastpage :

Abstract :

Clustering algorithms break down when the data points fall in huge-dimensional spaces. To tackle this problem, many subspace clustering methods have been proposed to build up a subspace where data points cluster efficiently. The bottom-up approach is widely used to select a set of candidate features and then to use a portion of this set to build up the hidden subspace step by step. The complexity depends exponentially or cubically on the number of selected features. In this paper, we present SEGCLU, a SEGregation-based subspace CLUstering method which significantly reduces the size of the candidate feature set and has cubic complexity. This algorithm was applied to noise-free data of DNA copy numbers of two groups of autistic and typically developing children to extract a potential biomarker for autism. 85% of the individuals were classified correctly in a 13-dimensional subspace.

Keywords :

DNA; bioinformatics; pattern clustering; 13-dimensional subspace; DNA copy numbers; SEGCLU; autism; autistic children; bottom-up approach; cubic complexity; hidden subspace; huge dimensional data; potential biomarker; segregation-based subspace clustering; typically developing children; Decision support systems;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Genomic Signal Processing and Statistics (GENSIPS), 2010 IEEE International Workshop on

Conference_Location :

Cold Spring Harbor, NY

ISSN :

2150-3001

Print_ISBN :

978-1-61284-791-7

Type :

conf

DOI :

10.1109/GENSIPS.2010.5719667

Filename :

5719667

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2600380