DocumentCode
3474012
Title
High dimensional gene expression data dimension reduction
Author
Chao, Shi ; Lihui, Chen
Author_Institution
Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore
Volume
1
fYear
2004
fDate
1-3 Dec. 2004
Firstpage
451
Abstract
Gene expression data analysis is a new approach to cancer diagnosis. Feature selection is an important preprocessing step in gene expression data clustering. In this paper, we demonstrate the effectiveness of a feature grouping approach to feature dimension reduction. In our proposed framework, the large number of features is grouped into several feature subsets. Using clustering accuracy as the criterion, one feature subset is chosen as the candidate subset for further processing by PCA or entropy ranking, and the final feature subset is formed by selecting the top-ranked features. The advantage of the framework is that it considers the discrimination power of both feature subsets and individual features, and it requires little information about the class labels. A prototype of the proposed framework has been implemented and tested on the leukemia data set. The results support the framework.
Keywords
biology computing; cancer; data analysis; data reduction; feature extraction; pattern clustering; principal component analysis; PCA; cancer diagnosis; data analysis; data clustering; feature selection; high dimensional gene expression data dimension reduction; leukemia data set; principal component analysis; Cancer; Clustering algorithms; DNA; Data analysis; Data engineering; Diseases; Gene expression; Genetics; Neoplasms; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cybernetics and Intelligent Systems, 2004 IEEE Conference on
Print_ISBN
0-7803-8643-4
Type
conf
DOI
10.1109/ICCIS.2004.1460457
Filename
1460457