Title :
A Double Pruning Scheme for Boosting Ensembles
Author :
Soto, Víctor ; García-Moratilla, Sergio ; Martínez-Muñoz, Gonzalo ; Hernández-Lobato, Daniel ; Suárez, Alberto
Author_Institution :
Escuela Politécnica Superior, Universidad Autónoma de Madrid, Cantoblanco, Spain
Abstract :
Ensemble learning consists of generating a collection of classifiers whose predictions are then combined to yield a single unified decision. Ensembles of complementary classifiers provide accurate and robust predictions, often better than those of the individual classifiers in the ensemble. Nevertheless, ensembles also have drawbacks: typically, all classifiers are queried to compute the final ensemble prediction, so every classifier must remain accessible to answer potential queries. This entails larger storage requirements and slower prediction than a single classifier. Ensemble pruning techniques are useful to alleviate these drawbacks. Static pruning techniques reduce the ensemble size by selecting a sub-ensemble of classifiers from the original ensemble. In dynamic pruning, the querying process is halted as soon as the partial ensemble prediction suffices to reach a stable final decision with a reasonable level of confidence. In this paper, we present the results of a comprehensive analysis of static and dynamic pruning techniques applied to AdaBoost ensembles. These ensemble pruning techniques are evaluated on a wide range of classification problems. From this analysis, one concludes that the combination of static and dynamic pruning provides a notable reduction in memory requirements and faster classification times without a significant loss of prediction accuracy.
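The sketch below illustrates, under simplifying assumptions, how the two pruning stages can be combined at prediction time for a binary AdaBoost ensemble whose base classifiers output labels in {-1, +1}. It is not the algorithm studied in the paper: the static step here simply keeps the largest-weight ensemble members, as a stand-in for the selection heuristics (including semi-definite programming) evaluated there, and the dynamic step uses a deterministic early-stopping rule rather than statistical instance-based pruning. The function name double_pruned_predict and the keep_fraction parameter are illustrative, not from the paper.

    import numpy as np

    def double_pruned_predict(classifiers, weights, x, keep_fraction=0.2):
        # Static pruning: select a fixed-size sub-ensemble once, before deployment.
        # Illustrative rule: keep the classifiers with the largest AdaBoost weights.
        n_keep = max(1, int(keep_fraction * len(classifiers)))
        order = np.argsort(weights)[::-1][:n_keep]
        sub_classifiers = [classifiers[i] for i in order]
        sub_weights = np.asarray(weights, dtype=float)[order]

        # Dynamic pruning: query the retained classifiers one by one and stop as
        # soon as the remaining total weight can no longer flip the sign of the
        # accumulated weighted vote.
        margin = 0.0
        remaining = sub_weights.sum()
        for clf, w in zip(sub_classifiers, sub_weights):
            # Assumes scikit-learn-style base learners that predict labels in {-1, +1}
            # and that x is a one-dimensional NumPy feature vector.
            margin += w * clf.predict(x.reshape(1, -1))[0]
            remaining -= w
            if abs(margin) > remaining:
                break
        return 1 if margin >= 0 else -1

With this deterministic stopping rule, the early-stopped prediction always coincides with the full sub-ensemble vote; the statistical instance-based pruning investigated in the paper can halt earlier by accepting a small, controlled probability of disagreement with the complete vote.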
Keywords :
learning (artificial intelligence); pattern classification; query processing; storage management; AdaBoost ensembles; boosting ensembles; classification problems; classification time; complementary classifiers; double pruning scheme; dynamic pruning techniques; ensemble learning; ensemble prediction; ensemble pruning techniques; ensemble size; memory requirements; querying process; static pruning techniques; storage requirements; Accuracy; Boosting; Heuristic algorithms; Memory management; Prediction algorithms; Training; Vectors; AdaBoost; double pruning; ensemble pruning; instance-based pruning; semi-definite programming
Journal_Title :
IEEE Transactions on Cybernetics
DOI :
10.1109/TCYB.2014.2313638