Title :
Forest pruning based on Tree-Node Order
Author :
Guo, Huaping ; Fan, Ming ; Ye, Yangdong
Author_Institution :
Sch. of Inf. Eng., ZhengZhou Univ., Zhengzhou, China
Abstract :
This paper proposes a forest pruning method called F-Pruning to improve the performance of ensembles based on decision trees. Rather than trimming each decision tree separately and/or selecting an optimal or sub-optimal subset of base classifiers to form an ensemble, F-Pruning treats a fixed number of trimmed or untrimmed decision trees as a forest (ensemble) and prunes branches directly from the forest to improve ensemble accuracy. F-Pruning is a greedy algorithm: it ranks each node using an impurity measure and the number of examples in the node, and at each step prunes the node with the lowest rank. In this way, F-Pruning prunes forests quickly and significantly reduces the size of the final ensembles. Our experiments show that, compared with ensembles built by combining trimmed or untrimmed decision trees, forests pruned by F-Pruning generalize better on most data sets. Our experiments also show that applying F-Pruning to sub-forests selected by EPIC significantly reduces the size of the final ensembles and slightly improves their classification accuracy.
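The greedy loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the rank formula or the stopping rule, so the `rank` function below (example-weighted Gini impurity reduction), the `max_prunes` budget, and all names (`Node`, `prunable`, `greedy_prune`, `forest_size`) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    counts: List[int]                      # training examples per class reaching this node
    children: List["Node"] = field(default_factory=list)

    @property
    def n(self) -> int:
        return sum(self.counts)

    def is_leaf(self) -> bool:
        return not self.children


def gini(counts: List[int]) -> float:
    """Gini impurity of a class-count vector (one possible impurity measure)."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)


def rank(node: Node) -> float:
    """ASSUMED rank: example-weighted impurity reduction of the node's split.
    The paper's actual rank formula is not stated in the abstract."""
    child_impurity = sum(c.n * gini(c.counts) for c in node.children)
    return node.n * gini(node.counts) - child_impurity


def prunable(node: Node) -> Iterator[Node]:
    """Yield internal nodes whose children are all leaves."""
    if node.is_leaf():
        return
    if all(c.is_leaf() for c in node.children):
        yield node
    else:
        for c in node.children:
            yield from prunable(c)


def forest_size(forest: List[Node]) -> int:
    """Total number of nodes over all trees in the forest."""
    def size(node: Node) -> int:
        return 1 + sum(size(c) for c in node.children)
    return sum(size(tree) for tree in forest)


def greedy_prune(forest: List[Node], max_prunes: int) -> int:
    """Repeatedly collapse the lowest-ranked prunable node anywhere in the forest.
    `max_prunes` is a hypothetical budget standing in for the real stopping rule."""
    pruned = 0
    while pruned < max_prunes:
        candidates = [n for root in forest for n in prunable(root)]
        if not candidates:
            break
        worst = min(candidates, key=rank)
        worst.children = []                # the node becomes a leaf
        pruned += 1
    return pruned
```

Because candidates are gathered across every tree before the minimum is taken, each step prunes from the forest as a whole instead of from one tree at a time, which is the distinction the abstract draws from per-tree trimming.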
Keywords :
data mining; decision trees; greedy algorithms; learning (artificial intelligence); EPIC; F-Pruning; forest pruning method; greedy algorithm; machine learning; tree-node order; trimmed decision trees; untrimmed decision trees; Accuracy; Complexity theory; Decision trees; Impurities; Radio frequency; Training; Vegetation; Ensemble Selection; Forest Pruning; Tree-Node Order
Conference_Title :
Computer Science and Automation Engineering (CSAE), 2011 IEEE International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-8727-1
DOI :
10.1109/CSAE.2011.5952636