Title :
Using all data to generate decision tree ensembles
Author :
Martínez-Muñoz, Gonzalo ; Suárez, Alberto
Author_Institution :
Comput. Sci. Dept., Univ. Autonoma de Madrid, Spain
Abstract :
This paper develops a new method for generating ensembles of classifiers in which every individual classifier is constructed from all of the available data. The base algorithm builds a decision tree iteratively. The training data are divided into two subsets. In each iteration, one subset is used to grow the decision tree, starting from the tree produced by the previous iteration. This fully grown tree is then pruned using the other subset. The roles of the two subsets are interchanged in every iteration. The process converges to a final tree that is stable with respect to the combined growing and pruning steps. To generate a variety of classifiers for the ensemble, the subsets needed by the iterative tree construction algorithm are created at random. The method exhibits good performance on several standard datasets at low computational cost.
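The grow/prune iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the split criterion, the pruning rule, the subset sizes, the stopping test, or the combination rule, so the choices below are assumptions made for illustration only: Gini impurity for splits, reduced-error pruning, two equal-sized random halves, a stopping test based on an unchanged node count, and majority voting. All names (`Node`, `grow`, `prune`, `iterative_tree`, `build_ensemble`, `predict`) are hypothetical.

```python
import random
from collections import Counter

class Node:
    def __init__(self, label):
        self.label = label                    # class predicted if this node is a leaf
        self.feat = self.thr = None           # split feature and threshold (None for a leaf)
        self.left = self.right = None

def majority(rows):
    return Counter(y for _, y in rows).most_common(1)[0][0]

def gini(rows):
    n = len(rows)
    return 1.0 - sum((c / n) ** 2 for c in Counter(y for _, y in rows).values())

def route(rows, feat, thr):
    """Partition rows by the test x[feat] <= thr."""
    return ([r for r in rows if r[0][feat] <= thr],
            [r for r in rows if r[0][feat] > thr])

def best_split(rows):
    """Return (score, feat, thr) of the best impurity-reducing split, or None."""
    best, base = None, gini(rows)
    for feat in range(len(rows[0][0])):
        for thr in sorted({x[feat] for x, _ in rows})[:-1]:
            lo, hi = route(rows, feat, thr)
            score = (len(lo) * gini(lo) + len(hi) * gini(hi)) / len(rows)
            if score < base - 1e-12 and (best is None or score < best[0]):
                best = (score, feat, thr)
    return best

def grow(node, rows):
    """Expand the leaves of an existing tree (or start a new one) using rows."""
    if not rows:
        return node
    if node is None:
        node = Node(majority(rows))
    if node.feat is None:                     # leaf: try to split it further
        split = best_split(rows)
        if split is None:
            return node
        _, node.feat, node.thr = split
        lo, hi = route(rows, node.feat, node.thr)
        node.left = grow(Node(majority(lo)), lo)
        node.right = grow(Node(majority(hi)), hi)
    else:                                     # internal node: pass rows down
        lo, hi = route(rows, node.feat, node.thr)
        grow(node.left, lo)
        grow(node.right, hi)
    return node

def prune(node, rows):
    """Reduced-error pruning (assumed rule); returns the error count on rows."""
    leaf_err = sum(y != node.label for _, y in rows)
    if node.feat is None:
        return leaf_err
    lo, hi = route(rows, node.feat, node.thr)
    sub_err = prune(node.left, lo) + prune(node.right, hi)
    if leaf_err <= sub_err:                   # subtree does not help: collapse it
        node.feat = node.thr = node.left = node.right = None
        return leaf_err
    return sub_err

def size(node):
    return 1 if node.feat is None else 1 + size(node.left) + size(node.right)

def iterative_tree(a, b, max_iter=20):
    """Alternate growing on one subset and pruning on the other, swapping roles."""
    tree, prev = None, -1
    for _ in range(max_iter):
        tree = grow(tree, a)
        prune(tree, b)
        a, b = b, a                           # interchange the roles of the subsets
        if size(tree) == prev:                # crude stability test (assumption)
            break
        prev = size(tree)
    return tree

def build_ensemble(data, n_trees=11, seed=0):
    """One tree per random two-way partition of the full training set."""
    rng, trees = random.Random(seed), []
    for _ in range(n_trees):
        rows = list(data)
        rng.shuffle(rows)
        half = len(rows) // 2
        trees.append(iterative_tree(rows[:half], rows[half:]))
    return trees

def predict(trees, x):
    """Majority vote over the ensemble (assumed combination rule)."""
    votes = []
    for t in trees:
        while t.feat is not None:
            t = t.left if x[t.feat] <= t.thr else t.right
        votes.append(t.label)
    return Counter(votes).most_common(1)[0][0]
```

Each row is assumed to be a pair `(features, label)` with `features` a tuple of numbers, so a call might look like `predict(build_ensemble(data), (0.3, 1.7))`. In this sketch the ensemble members differ only through the random partition, yet every tree sees all of the training data, either for growing or for pruning, in each iteration.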
Keywords :
decision trees; error statistics; learning (artificial intelligence); pattern classification; bagging; base algorithm; classification ensembles; data subsets; iterative tree construction algorithm; pattern recognition; Classification tree analysis; Computational efficiency; Helium; Iterative algorithms; Supervised learning; Training data; Voting
Journal_Title :
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
DOI :
10.1109/TSMCC.2004.833295