DocumentCode
2465677
Title
Improving Feature Subset Selection Using a Genetic Algorithm for Microarray Gene Expression Data
Author
Tan, Feng ; Fu, Xuezheng ; Zhang, Yanqing ; Bourgeois, Anu G.
Author_Institution
Georgia State Univ., Atlanta
fYear
2006
fDate
0-0 0
Firstpage
2529
Lastpage
2534
Abstract
Microarray data usually contain a huge number of genes (features) and a comparatively small number of samples, which makes accurate classification or prediction of diseases challenging. Feature selection techniques can help distinguish important features from irrelevant (unimportant) ones by applying certain selection criteria. However, different feature selection algorithms, based on different theoretical arguments, often produce different results when applied to the same data set. This makes it difficult to select an optimal or near-optimal feature subset for a given data set. In this paper, we propose using a genetic algorithm to improve feature subset selection by combining valuable outcomes from multiple feature selection methods. The goal of our genetic algorithm is to balance classification accuracy against the size of the selected feature subset. The advantages of this approach include the ability to accommodate different feature selection criteria and to find small subsets of features that perform well with the particular inductive learning algorithm used to build the classifier. The experimental results demonstrate that our approach can find feature subsets with higher classification accuracy and/or smaller size than those produced by each individual feature selection algorithm.
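The accuracy-versus-size trade-off described in the abstract can be sketched as a small genetic algorithm over bitmasks. This is only an illustration under stated assumptions: the abstract does not give the paper's encoding, operators, fitness function, or parameters, so the weighted-sum fitness, the `weight` value, the one-point crossover, the elitism scheme, and the `toy_accuracy` stand-in for a real classifier are all hypothetical. The candidate pool is assumed to be the union of features returned by several feature selection methods, with one bit per pooled feature.

```python
import random


def toy_accuracy(mask):
    """Hypothetical stand-in for classifier accuracy on the selected features.

    Assumes pooled features 0 and 3 are the only informative ones.
    A real run would train and validate a classifier here instead.
    """
    return 0.5 + 0.25 * mask[0] + 0.25 * mask[3]


def fitness(mask, accuracy_fn, weight=0.8):
    """Weighted sum of accuracy and subset smallness (assumed form, not the paper's)."""
    size = sum(mask)
    if size == 0:
        return 0.0  # an empty subset cannot classify anything
    smallness = 1.0 - size / len(mask)
    return weight * accuracy_fn(mask) + (1.0 - weight) * smallness


def run_ga(pool_size, accuracy_fn, pop_size=20, generations=30,
           mutation_rate=0.05, weight=0.8, seed=0):
    """Evolve bitmasks over the candidate pool and return the best one found."""
    rng = random.Random(seed)

    def score(mask):
        return fitness(mask, accuracy_fn, weight)

    # Random initial population: each individual marks which pooled features to keep.
    pop = [[rng.randint(0, 1) for _ in range(pool_size)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        next_pop = pop[:2]  # elitism: the two best individuals survive unchanged
        while len(next_pop) < pop_size:
            # Truncation selection: parents are drawn from the top half.
            parent_a, parent_b = rng.sample(pop[:pop_size // 2], 2)
            cut = rng.randrange(1, pool_size)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation with a small per-bit probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=score)
```

Calling `run_ga(8, toy_accuracy)` returns a list of eight 0/1 flags over the pooled features. Because the two best individuals are carried forward each generation, the best fitness never decreases from one generation to the next.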
Keywords
diseases; feature extraction; genetic algorithms; genetics; learning by example; medical computing; pattern classification; classification accuracy; disease prediction; feature subset selection; genetic algorithm; inductive learning algorithm; microarray gene expression data; near optimal feature subset; Computer science; Diseases; Diversity reception; Gene expression; Genetic algorithms; Mutual information; Pattern classification; Scalability; Statistical analysis; Statistics;
fLanguage
English
Publisher
IEEE
Conference_Titel
Evolutionary Computation, 2006. CEC 2006. IEEE Congress on
Conference_Location
Vancouver, BC
Print_ISBN
0-7803-9487-9
Type
conf
DOI
10.1109/CEC.2006.1688623
Filename
1688623