Title :
Feature selection via decision tree surrogate splits
Author :
Springer, Clayton ; Kegelmeyer, W. Philip
Author_Institution :
Biosystems Res. Dept., Sandia Nat. Labs., Livermore, CA
Abstract :
CART's "variable ranking" provides a quick estimate of the importance of an individual feature in a decision tree, and it is based on surrogate splits. We extend this estimate to arbitrary subsets. We have applied our estimate (called "dI") to three datasets. The performance of dI as an importance estimate is very dependent on the underlying performance of the tree used to generate the surrogate splits.
Keywords :
decision trees; importance sampling; CART variable ranking; decision tree surrogate splits; feature selection; importance estimation; Biomedical measurements; Decision trees; Entropy; Genetic algorithms; Impurities; Laboratories; Phase locked loops; Training data;
Conference_Title :
Pattern Recognition, 2008. ICPR 2008. 19th International Conference on
Conference_Location :
Tampa, FL
Print_ISBN :
978-1-4244-2174-9
Electronic_ISBN :
1051-4651
DOI :
10.1109/ICPR.2008.4761257