Title :
Margin-Maximizing Feature Elimination Methods for Linear and Nonlinear Kernel-Based Discriminant Functions
Author :
Aksu, Yaman ; Miller, David J. ; Kesidis, George ; Yang, Qing X.
Author_Institution :
Electr. Eng. Dept., Pennsylvania State Univ., University Park, PA, USA
fDate :
5/1/2010
Abstract :
Feature selection for classification in high-dimensional spaces can improve generalization, reduce classifier complexity, and identify important, discriminating feature "markers." For support vector machine (SVM) classification, a widely used technique is recursive feature elimination (RFE). We demonstrate that RFE is not consistent with margin maximization, which is central to the SVM learning approach. We thus propose explicit margin-based feature elimination (MFE) for SVMs and demonstrate both improved margin and improved generalization compared with RFE. Moreover, for the case of a nonlinear kernel, we show that RFE assumes the squared weight vector 2-norm is strictly decreasing as features are eliminated. We demonstrate that this assumption fails for the Gaussian kernel and that, consequently, RFE may give poor results in this case. MFE for nonlinear kernels gives better margin and generalization. We also present an extension that achieves further margin gains by optimizing only two degrees of freedom, the hyperplane's intercept and its squared 2-norm, with the weight vector orientation fixed. We finally introduce an extension that allows margin slackness. We compare against several alternatives, including RFE and a linear programming method that embeds feature selection within the classifier design. On high-dimensional gene microarray data sets, University of California at Irvine (UCI) repository data sets, and Alzheimer's disease brain image data, MFE methods give promising results.
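Note: The abstract contrasts RFE, which ranks features by weight magnitude, with MFE, which eliminates the feature whose removal best preserves the classifier margin. The sketch below illustrates that contrast for the linear-kernel case only; it is not the authors' implementation, and the scikit-learn training step, function names, and toy data are assumptions introduced purely for illustration.

    # Minimal sketch (assumed setup, not the paper's code): compare RFE's
    # weight-magnitude criterion with a margin-based elimination criterion
    # for a trained linear SVM.
    import numpy as np
    from sklearn.svm import SVC

    def rfe_candidate(w, active):
        # RFE criterion: among active features, drop the one with smallest |w_j|.
        return active[int(np.argmin(np.abs(w[active])))]

    def mfe_candidate(w, b, X, y, active):
        # Margin-based criterion: drop the feature whose removal leaves the
        # largest normalized minimum margin, holding the remaining weights
        # and the intercept fixed.
        best_j, best_margin = None, -np.inf
        for j in active:
            keep = [k for k in active if k != j]
            w_r = w[keep]
            scores = y * (X[:, keep] @ w_r + b)
            margin = scores.min() / np.linalg.norm(w_r)
            if margin > best_margin:
                best_j, best_margin = j, margin
        return best_j

    # Toy usage: train once, then ask each criterion which feature to drop first.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 10))
    y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)
    clf = SVC(kernel="linear", C=1.0).fit(X, y)
    w, b = clf.coef_.ravel(), float(clf.intercept_[0])
    active = list(range(X.shape[1]))
    print("RFE would drop feature", rfe_candidate(w, active))
    print("MFE would drop feature", mfe_candidate(w, b, X, y, active))

For the nonlinear-kernel and margin-slackness variants described in the abstract, the margin evaluation would instead be carried out in the kernel-induced feature space rather than on an explicit weight vector; the sketch above does not cover those cases.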
Keywords :
Gaussian processes; feature extraction; generalisation (artificial intelligence); linear programming; pattern classification; recursive estimation; support vector machines; Gaussian kernel; SVM classification; SVM learning approach; classifier complexity reduction; feature selection; improved generalization; linear programming method; margin based feature elimination; margin maximization; margin maximizing feature elimination method; nonlinear kernel based discriminant function; recursive feature elimination; squared weight vector; support vector machine; Alzheimer's disease; Gaussian kernel; classifier margin; discriminant function; feature elimination; magnetic resonance imaging (MRI); margin maximization; medical imaging; microarray; neurodegenerative; polynomial kernel; recursive feature elimination (RFE); support vector machine (SVM); Algorithms; Artificial Intelligence; Colonic Neoplasms; Discriminant Analysis; Discrimination (Psychology); Gene Expression Profiling; Generalization (Psychology); Humans; Linear Models; Nonlinear Dynamics; Oligonucleotide Array Sequence Analysis; Tumor Markers, Biological
Journal_Title :
Neural Networks, IEEE Transactions on
DOI :
10.1109/TNN.2010.2041069