• DocumentCode
    88701
  • Title
    Feature Selection Based on Dependency Margin
  • Author
    Yong Liu; Feng Tang; Zhiyong Zeng
  • Author_Institution
    State Key Lab. of Ind. Control Technol., Zhejiang Univ., Hangzhou, China
  • Volume
    45
  • Issue
    6
  • fYear
    2015
  • fDate
    Jun-15
  • Firstpage
    1209
  • Lastpage
    1221
  • Abstract
    Feature selection seeks a subset of features from a larger feature pool such that the selected subset performs as well as, or better than, the whole set. It is often a critical preprocessing step in machine-learning applications such as clustering and classification. In this paper, we focus on feature selection for supervised classification, which aims to find the features that best predict class labels. Traditional greedy search algorithms select features incrementally based on the relevance between each candidate feature and the class label. However, this may lead to suboptimal results when redundant features interfere with the selection. To solve this problem, we propose a subset selection algorithm that considers the relevance to the label of both the selected and the remaining features. The intuition is that features that have no better alternatives in the feature set should be selected first. We formulate the selection problem as maximizing the dependency margin, measured as the difference between the performance of the selected feature set and that of the remaining feature set. Extensive experiments on various data sets show the superiority of the proposed approach over traditional algorithms.
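    The margin idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes discrete features, uses empirical mutual information as a stand-in for "feature set performance", and uses a plain forward greedy loop. The function names (mutual_information, dependency_margin, greedy_margin_selection) and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(rows, labels, features):
    """Empirical I(X_features; Y) in bits for discrete data (assumed stand-in
    for the paper's performance measure). Returns 0.0 for an empty set."""
    if not features:
        return 0.0
    n = len(labels)
    keys = [tuple(row[f] for f in features) for row in rows]
    joint = Counter(zip(keys, labels))
    px = Counter(keys)
    py = Counter(labels)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def dependency_margin(rows, labels, selected, remaining):
    """Score of the selected set minus score of the remaining set."""
    return (mutual_information(rows, labels, selected)
            - mutual_information(rows, labels, remaining))


def greedy_margin_selection(rows, labels, k):
    """Forward greedy search: at each step, move the feature whose transfer
    from the remaining set to the selected set maximizes the margin."""
    remaining = list(range(len(rows[0])))
    selected = []
    for _ in range(min(k, len(remaining))):
        best = max(
            remaining,
            key=lambda f: dependency_margin(
                rows, labels,
                selected + [f],
                [g for g in remaining if g != f],
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): features 0 and 1 are identical copies of bit a,
# feature 2 is bit b, and the label encodes both bits.
rows = [(a, a, b) for a in (0, 1) for b in (0, 1)]
labels = [2 * a + b for a in (0, 1) for b in (0, 1)]

print(greedy_margin_selection(rows, labels, 1))  # [2]
```

    In this toy case each feature alone carries one bit about the label, so a relevance-only ranking cannot separate them. The margin prefers feature 2 because removing it leaves the remaining set unable to recover bit b, whereas either copy of bit a has an equally good alternative left behind.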
  • Keywords
    feature selection; greedy algorithms; learning (artificial intelligence); pattern classification; search problems; set theory; dependency margin; feature pool; feature relevances; feature set performance; greedy search algorithms; machine-learning applications; subset selection algorithm; supervised classification; Approximation algorithms; Bayes methods; Markov processes; Prediction algorithms; Redundancy; Silicon; Conditionally independent; forward greedy search; redundant feature
  • fLanguage
    English
  • Journal_Title
    Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2168-2267
  • Type
    jour
  • DOI
    10.1109/TCYB.2014.2347372
  • Filename
    6912011