DocumentCode
3425751
Title
Decision trees as information source for attribute selection
Author
Fukuda, Kyoko ; Martin, Brent
Author_Institution
Dept. of Math. & Stat., Univ. of Canterbury, Christchurch
fYear
2009
fDate
March 30 2009-April 2 2009
Firstpage
101
Lastpage
108
Abstract
Attribute selection (AS) is known to improve the results of algorithmic learning processes by selecting fewer, but predictive, input attributes. This study introduces a new ranking filter AS method, the tree node selection (TNS) method. TNS identifies a smaller set of significant attributes by searching a pre-generated decision tree, used as an information source, in the manner of a pruning process. To test the performance of TNS, 33 benchmark datasets (UCI) with various numbers of instances, attributes and classes were investigated along with five known AS methods, and the results were tested with the C4.5 (unpruned) and naive Bayes classifiers. Performance, in terms of classification accuracy improvement, reduction in the number of attributes, and the size of the generated decision tree, was assessed by various statistical analyses for multiple comparisons. TNS was found to provide the most consistent performance for the C4.5 and naive Bayes classifiers, and unpruned C4.5 trees generated from the fewer selected attributes were generally found to achieve quality similar to that of pruned C4.5 trees built without any attribute selection.
Keywords
Bayes methods; data handling; decision trees; algorithmic learning processes; attribute selection; information source; naive Bayes classifiers; pruned C4.5 trees; pruning process; ranking filter AS method; tree node selection; unpruned C4.5 trees; Air pollution; Atmospheric measurements; Benchmark testing; Classification algorithms; Classification tree analysis; Filters; Gain measurement; Pollution measurement; Statistical analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
Conference_Location
Nashville, TN
Print_ISBN
978-1-4244-2765-9
Type
conf
DOI
10.1109/CIDM.2009.4938636
Filename
4938636