  • DocumentCode
    2961779
  • Title
    The problem with ranking ensembles based on training or validation performance
  • Author
    Johansson, Ulf; Löfström, Tuve; Boström, Henrik
  • Author_Institution
    School of Business and Informatics, University of Borås, Borås
  • fYear
    2008
  • fDate
    1-8 June 2008
  • Firstpage
    3222
  • Lastpage
    3228
  • Abstract
    The main purpose of this study was to determine whether results on training or validation data can be used to estimate ensemble performance on novel data. With the specific setup evaluated, i.e., ensembles built from a pool of independently trained neural networks and targeting diversity only implicitly, the answer is a resounding no. Experiments on 13 UCI datasets show that there is, in general, nothing to gain in performance on novel data by choosing an ensemble based on any of the training measures evaluated here. This holds even though the measures evaluated include the most frequently used ones: ensemble training and validation accuracy, base classifier training and validation accuracy, ensemble training and validation AUC, and two diversity measures. The main reason is that all ensembles tend to have quite similar performance unless the accuracy of the base classifiers is deliberately lowered. The key consequence is that a data miner can do no better than picking an ensemble at random. In addition, the results indicate that it is futile to look for an algorithm aimed at optimizing ensemble performance by selecting a subset of the available base classifiers.
  • Keywords
    data mining; learning (artificial intelligence); pattern classification; UCI dataset; base classifier training; data miner; ensemble training; independently trained neural network; ranking ensemble; Artificial neural networks; Diversity reception; Equations; Gain measurement; Informatics; Neural networks; Performance gain; Predictive models; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1820-6
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2008.4634255
  • Filename
    4634255
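
The selection question posed in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' experimental setup: it assumes a binary task, replaces the neural networks with simulated base classifiers whose per-instance correctness is drawn at random, and uses hypothetical names and parameters throughout (`make_pool`, `vote_accuracy`, `compare_selection`, the 0.68 to 0.72 skill range, the set sizes). It picks the ensemble with the highest validation accuracy and returns its test accuracy next to the mean test accuracy over all candidate ensembles, which is what a random pick would achieve on average.

```python
import random


def make_pool(n_models, n_val, n_test, rng):
    """Simulate base classifiers as lists of per-instance correctness flags.

    Assumption: each model has its own fixed skill drawn from a narrow range,
    and is correct independently on every instance with that probability.
    """
    pool = []
    for _ in range(n_models):
        skill = rng.uniform(0.68, 0.72)  # hypothetical, nearly equal skills
        val = [rng.random() < skill for _ in range(n_val)]
        test = [rng.random() < skill for _ in range(n_test)]
        pool.append((val, test))
    return pool


def vote_accuracy(members, split):
    """Majority-vote accuracy on one split (0 = validation, 1 = test).

    On a binary task the vote is correct exactly when more than half of the
    members are correct; an odd ensemble size avoids ties.
    """
    n = len(members[0][split])
    hits = 0
    for i in range(n):
        votes = sum(m[split][i] for m in members)
        if votes * 2 > len(members):
            hits += 1
    return hits / n


def compare_selection(n_ensembles=200, ensemble_size=11, seed=0):
    """Return (test accuracy of the best-on-validation ensemble,
    mean test accuracy over all candidate ensembles)."""
    rng = random.Random(seed)
    pool = make_pool(n_models=50, n_val=200, n_test=1000, rng=rng)
    scored = []
    for _ in range(n_ensembles):
        members = rng.sample(pool, ensemble_size)
        scored.append((vote_accuracy(members, 0), vote_accuracy(members, 1)))
    picked = max(scored, key=lambda s: s[0])
    mean_test = sum(t for _, t in scored) / len(scored)
    return picked[1], mean_test


if __name__ == "__main__":
    picked_test, random_test = compare_selection()
    print(f"selected by validation: {picked_test:.3f}")
    print(f"average (random pick):  {random_test:.3f}")
```

Comparing the two printed numbers across several seeds gives a rough feel for how much, if anything, validation-based selection buys in this simplified setting. Any conclusion drawn from it applies to the simulation only, not to the 13 UCI datasets used in the paper.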