Title :
Feature Selection Metric Using AUC Margin for Small Samples and Imbalanced Data Classification Problems
Author :
Alshawabkeh, Malak ; Aslam, Javed A. ; Dy, Jennifer ; Kaeli, David
Author_Institution :
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA, USA
Abstract :
Feature selection addresses high-dimensional problems by retaining only the features that are most important for the classification task. However, traditional feature selection methods fail to account for imbalanced class distributions, leading to poor predictions for minority class samples. Recently, there has been growing interest in the area under the ROC curve (AUC) metric because it provides meaningful performance measures in the presence of imbalanced data. In this paper, we propose a new margin-based feature selection metric that defines the quality of a set of features by the maximized AUC margin it induces during learning with boosting. Our algorithm measures the cumulative effect each feature has on the margin distribution associated with the weighted linear combination that boosting produces over the positive and negative examples. Experiments on various real imbalanced data sets show the effectiveness of our algorithm at selecting informative features from small data sets with skewed class distributions.
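As a rough illustration of the quantities the abstract refers to, AUC can be read as the fraction of (positive, negative) example pairs that a scoring function orders correctly, and a pairwise margin as the score gap between a positive and a negative example. The sketch below is not the paper's algorithm: the function names `pairwise_auc` and `min_pairwise_margin`, the tie handling, and the choice of the minimum gap as "the margin" are all assumptions made for illustration, and the scores are made-up values standing in for the output of a boosted ensemble.

```python
def pairwise_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (a common convention, assumed here).
    """
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


def min_pairwise_margin(pos_scores, neg_scores):
    """Smallest score gap between any positive and any negative example.

    Illustrative definition only; a positive value means every positive
    example outscores every negative one.
    """
    return min(pos_scores) - max(neg_scores)


# Hypothetical scores for an imbalanced toy set: 3 positives, 4 negatives.
pos = [0.9, 0.6, 0.4]
neg = [0.5, 0.3, 0.2, 0.1]

auc = pairwise_auc(pos, neg)            # 11 of the 12 pairs are ordered correctly
margin = min_pairwise_margin(pos, neg)  # negative: one positive scores below a negative
print(f"AUC = {auc:.4f}, minimum pairwise margin = {margin:.2f}")
```

Because the pairwise view weighs every positive-negative pair equally, the result does not depend on how many examples each class contributes, which is one way to see why AUC stays informative under class imbalance.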
Keywords :
data handling; learning (artificial intelligence); pattern classification; statistical distributions; AUC metric; area under ROC curve; data possessing skewed class distributions; feature selection methods; imbalanced class distributions; imbalanced data classification problems; margin distribution; margin-based feature selection metric; maximized AUC margin; minority class samples; performance measures; real imbalanced data sets; weighted linear combination; Accuracy; Boosting; Cancer; Measurement; Support vector machines; Training; area under the ROC curve (AUC); boosting; feature selection; margin;
Conference_Title :
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4577-2134-2
DOI :
10.1109/ICMLA.2011.70