• DocumentCode
    2774420
  • Title
    Feature selection via regularized trees
  • Author
    Houtao Deng; G. Runger

  • Author_Institution
    Intuit, Mountain View, CA, USA
  • fYear
    2012
  • fDate
    10-15 June 2012
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    We propose a tree regularization framework that enables many tree models to perform feature selection efficiently. The key idea of the regularization framework is to penalize selecting a new feature for splitting when its gain (e.g., information gain) is similar to that of the features used in previous splits. The regularization framework is applied here to random forests and boosted trees, and can be easily applied to other tree models. Experimental studies show that the regularized trees can select high-quality feature subsets with regard to both strong and weak classifiers. Because tree models can naturally handle categorical and numerical variables, missing values, different scales between variables, interactions, and nonlinearities, the tree regularization framework provides an effective and efficient feature selection solution for many practical problems.
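    The penalized-gain idea from the abstract can be sketched as follows. This is a minimal illustration, assuming a multiplicative penalty coefficient in (0, 1] applied to the gain of features not yet used in previous splits; the function names and the exact penalty form are illustrative, not the paper's reference implementation.

    ```python
    def regularized_gain(feature, gain, selected, penalty=0.8):
        """Return the penalized split gain for `feature`.

        `gain` is the raw split gain (e.g., information gain),
        `selected` is the set of features used in previous splits,
        and `penalty` in (0, 1] discourages adding new features.
        """
        if feature in selected:
            return gain           # already-selected features keep their full gain
        return penalty * gain     # new features must beat the penalty to be chosen

    def choose_split(gains, selected, penalty=0.8):
        """Pick the feature with the highest regularized gain at a node
        and record it in the selected-feature set."""
        best = max(gains, key=lambda f: regularized_gain(f, gains[f], selected, penalty))
        selected.add(best)
        return best
    ```

    With this rule, a new feature is chosen only when its gain exceeds that of every previously used feature by a margin set by the penalty, so the ensemble tends to reuse a compact feature subset, which is the basis of the regularized random forest (RRF) and regularized boosted trees (RBoost) named in the keywords.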
  • Keywords
    pattern classification; trees (mathematics); boosted trees; categorical variables; feature selection; high-quality feature subsets; interactions; missing values; nonlinearities; numerical variables; random forest; strong classifiers; tree models; tree regularization framework; variables scale; weak classifiers; Accuracy; Decision trees; Loss measurement; Radio frequency; Redundancy; Training; Vegetation; RBoost; RRF; regularized boosted trees; regularized random forest; tree regularization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    The 2012 International Joint Conference on Neural Networks (IJCNN)
  • Conference_Location
    Brisbane, QLD
  • ISSN
    2161-4393
  • Print_ISBN
    978-1-4673-1488-6
  • Electronic_ISBN
    2161-4393
  • Type
    conf
  • DOI
    10.1109/IJCNN.2012.6252640
  • Filename
    6252640