DocumentCode :
1359074
Title :
PLS-Based Gene Selection and Identification of Tumor-Specific Genes
Author :
Ji, Guoli ; Yang, Zijiang ; You, Wenjie
Author_Institution :
Dept. of Autom., Xiamen Univ., Xiamen, China
Volume :
41
Issue :
6
fYear :
2011
Firstpage :
830
Lastpage :
841
Abstract :
Identifying tumor-specific genes from microarray data is characterized by high-dimensional small samples, strong relevance, and high noise. In view of these characteristics, a novel partial least squares (PLS) based gene-selection method, which synthesizes genetic relatedness and is suitable for multicategory classification, is presented. Using the differences in how well the independent variables explain the dependent variable (class), we define three indicators for global gene selection, which take into account the combined effects of all the genes and the correlation among the genes. Integrated with the linear kernel support vector classifier (SVC), the proposed method is tested on the MIT acute myeloid leukemia/acute lymphoblastic leukemia (AML/ALL) and small round blue cell tumors (SRBCT) data sets. A small subset of specific genes with high identification ability is obtained. The results indicate that our proposed PLS-based method for tumor-specific gene selection is highly efficient. Compared with the literature, the specific genes selected from both the two-category data set AML/ALL and the multicategory data set SRBCT are credible. Further investigation shows that the proposed gene-selection method is robust. Overall, the proposed method can effectively solve the feature-selection problem for high-dimensional small samples, and it also performs well for multicategory classification.
Keywords :
biology computing; diseases; genetics; least squares approximations; pattern classification; support vector machines; tumours; AML/ALL; MIT acute myeloid leukemia; PLS based gene-selection method; PLS-based gene selection; PLS-based method; SRBCT data sets; SVC; acute lymphoblastic leukemia; feature-selection problem; genetic relatedness; global gene selection; high-dimensional small sample; identification; linear kernel support vector classifier; microarray; multicategory classification; multicategory dataset SRBCT; partial least squares; small round blue cell tumors data sets; tumor-specific genes selection; Gene expression; Least squares methods; Mathematical model; Tumors; Gene selection; high-dimensional small samples; partial least squares (PLS); tumor-specific gene;
fLanguage :
English
Journal_Title :
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
Publisher :
IEEE
ISSN :
1094-6977
Type :
jour
DOI :
10.1109/TSMCC.2010.2078503
Filename :
5607317