Title :
Feature selection for microarray data by AUC analysis
Author :
Canul-Reich, Juana ; Hall, Lawrence O. ; Goldgof, Dmitry ; Eschrich, Steven A.
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL
Abstract :
Microarray datasets typically contain a small number of samples but a large number of gene expression values. Dimensionality reduction through feature/gene selection is therefore highly important for classification. In this paper, a feature perturbation method we previously introduced is applied to gene selection from microarray data. A publicly available colon cancer dataset is used in our experiments. Compared with SVM-RFE, our method performs better for feature sets of between 10 and 80 features; for fewer than 10 features, however, SVM-RFE yields higher accuracy. An analysis of the area under the curve (AUC) of the feature perturbation method for the top 50 and top 25 features is performed to determine the proper amount of noise to apply. We show that a good small set of features/genes can be found using the feature perturbation method.
Keywords :
biology computing; cancer; genetics; pattern classification; colon cancer dataset; dimensionality reduction; feature perturbation method; feature selection; feature/gene selection process; gene expressions; microarray datasets; Biomedical engineering; Cancer; Colon; Computer science; Data analysis; Gene expression; Noise level; Perturbation methods; Support vector machine classification; Support vector machines; Microarray data; classification; feature selection; support vector machines;
Conference_Title :
Systems, Man and Cybernetics, 2008. SMC 2008. IEEE International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-2383-5
Electronic_ISBN :
1062-922X
DOI :
10.1109/ICSMC.2008.4811371