Abstract :
Class prediction and feature selection are two learning tasks that are tightly coupled in the search for molecular profiles from microarray data [Cesare, Maria, Stefano, Giuseppe, "Semisupervised Learning for Molecular Profiling," IEEE/ACM Trans. on Computational Biology and Bioinformatics, vol. 2, no. 2, pp. 110-118, 2005]. In this paper, we present a recursive feature addition scheme for gene selection that combines classifiers to classify tumor tissues from DNA microarray data. Guided by the highest training accuracy, the next gene is added to the feature set according to measures of correlation or mutual information between the chosen genes and the candidate genes. Compared with the well-known gene selection methods T-TEST and SVM-RFE under different classifiers, our method achieves, on average, the best classification accuracy across different feature dimensions.
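The recursive feature addition idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact selection rule, so the sketch assumes that candidates are first ranked by training accuracy and that ties are broken by the lowest mean absolute Pearson correlation with the genes already chosen. The nearest-centroid classifier and the names `centroid_train_accuracy`, `recursive_feature_addition`, and `tol` are hypothetical stand-ins introduced for illustration only.

```python
import numpy as np


def centroid_train_accuracy(X, y):
    """Training accuracy of a nearest-centroid classifier.

    Assumption: this simple classifier stands in for whichever
    classifiers the paper actually combines.
    """
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every sample to every class centroid.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dist.argmin(axis=1)]
    return float((pred == y).mean())


def recursive_feature_addition(X, y, n_genes, tol=1e-12):
    """Greedily add genes (columns of X) one at a time.

    Assumed rule: keep the candidates whose addition gives the highest
    training accuracy, then pick the one least correlated (mean absolute
    Pearson r) with the genes chosen so far. Columns are assumed to be
    non-constant so that the correlation matrix is well defined.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))  # gene-by-gene |r|
    chosen = []
    remaining = list(range(X.shape[1]))
    while len(chosen) < n_genes and remaining:
        acc = np.array(
            [centroid_train_accuracy(X[:, chosen + [g]], y) for g in remaining]
        )
        # Candidates tied for the best training accuracy.
        tied = [g for g, a in zip(remaining, acc) if a >= acc.max() - tol]
        if chosen:
            redundancy = [corr[g, chosen].mean() for g in tied]
            best = tied[int(np.argmin(redundancy))]
        else:
            best = tied[0]  # first gene: accuracy alone decides
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Here `X` is a samples-by-genes expression matrix and `y` holds the class labels. A mutual-information estimate could replace the correlation matrix without changing the loop structure.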
Keywords :
DNA; medical computing; recursive estimation; DNA microarray data; SVM-RFE; T-TEST; class prediction; feature selection; gene selection; molecular profiles search; recursive feature addition; tumor tissue classification; Cancer; Computer science; Electronic mail; Gene expression; Machine learning; Machine learning algorithms; Mutual information; Neoplasms; Testing