  • DocumentCode
    3425751
  • Title
    Decision trees as information source for attribute selection
  • Author
    Fukuda, Kyoko ; Martin, Brent
  • Author_Institution
    Dept. of Math. & Stat., Univ. of Canterbury, Christchurch
  • fYear
    2009
  • fDate
    March 30 2009-April 2 2009
  • Firstpage
    101
  • Lastpage
    108
  • Abstract
    Attribute selection (AS) is known to improve the results of algorithmic learning processes by selecting fewer, but predictive, input attributes. This study introduces a new ranking filter AS method, the tree node selection (TNS) method. The idea of TNS is to identify a smaller set of significant attributes by searching through a pre-generated decision tree, used as the information source, in the manner of a pruning process. To test the performance of TNS, 33 benchmark datasets (UCI) with various numbers of instances, attributes and classes were investigated along with five known AS methods, and the results were tested with the C4.5 (unpruned) and naive Bayes classifiers. Performance, in terms of classification accuracy improvement, reduction in the number of attributes and the size of the generated decision tree, was assessed by various statistical analyses for multiple comparisons. TNS was found to provide the most consistent performance for the C4.5 and naive Bayes classifiers, and unpruned C4.5 trees generated from the smaller selected attribute sets were generally found to achieve similar quality to pruned C4.5 trees without any attribute selection.
  • Keywords
    Bayes methods; data handling; decision trees; algorithmic learning processes; attribute selection; information source; naive Bayes classifiers; pruned C4.5 trees; pruning process; ranking filter AS method; tree node selection; unpruned C4.5 trees; Air pollution; Atmospheric measurements; Benchmark testing; Classification algorithms; Classification tree analysis; Filters; Gain measurement; Pollution measurement; Statistical analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    978-1-4244-2765-9
  • Type
    conf
  • DOI
    10.1109/CIDM.2009.4938636
  • Filename
    4938636