DocumentCode
2961779
Title
The problem with ranking ensembles based on training or validation performance
Author
Johansson, Ulf ; Löfström, Tuve ; Boström, Henrik
Author_Institution
School of Business and Informatics, University of Borås, Borås
fYear
2008
fDate
1-8 June 2008
Firstpage
3222
Lastpage
3228
Abstract
The main purpose of this study was to determine whether results on training or validation data can be used to estimate ensemble performance on novel data. With the specific setup evaluated, i.e., ensembles built from a pool of independently trained neural networks and targeting diversity only implicitly, the answer is a resounding no. Experimentation on 13 UCI datasets shows that there is, in general, nothing to gain in performance on novel data by choosing an ensemble based on any of the measures evaluated here. This holds even though the measures evaluated include the most frequently used ones: ensemble training and validation accuracy, base classifier training and validation accuracy, ensemble training and validation AUC, and two diversity measures. The main reason is that all ensembles tend to have quite similar performance, unless the accuracy of the base classifiers is deliberately lowered. The key consequence is that a data miner can do no better than picking an ensemble at random. In addition, the results indicate that it is futile to look for an algorithm aimed at optimizing ensemble performance by selecting a subset of the available base classifiers.
Keywords
data mining; learning (artificial intelligence); pattern classification; UCI dataset; base classifier training; data miner; ensemble training; independently trained neural network; ranking ensemble; Artificial neural networks; Diversity reception; Equations; Gain measurement; Informatics; Neural networks; Performance gain; Predictive models; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
Conference_Location
Hong Kong
ISSN
1098-7576
Print_ISBN
978-1-4244-1820-6
Electronic_ISBN
1098-7576
Type
conf
DOI
10.1109/IJCNN.2008.4634255
Filename
4634255