• DocumentCode
    1942168
  • Title
    The Importance of Diversity in Neural Network Ensembles - An Empirical Investigation
  • Author
    Johansson, Ulf ; Löfström, Tuve ; Niklasson, Lars
  • Author_Institution
    Borås Univ., Borås
  • fYear
    2007
  • fDate
    12-17 Aug. 2007
  • Firstpage
    661
  • Lastpage
    666
  • Abstract
    When designing ensembles, it is almost an axiom that the base classifiers must be diverse in order for the ensemble to generalize well. Unfortunately, there is no clear definition of the key term diversity, which has led to several diversity measures and many more or less ad hoc methods for creating diversity in ensembles. In addition, no specific diversity measure has been shown to correlate highly with test set accuracy. The purpose of this paper is to empirically evaluate ten different diversity measures, using neural network ensembles and 11 publicly available data sets. The main result is that, in this study too, all diversity measures evaluated show low or very low correlation with test set accuracy. That said, two measures, double fault and difficulty, show slightly higher correlations than the other measures. The study furthermore shows that the correlation between accuracy measured on training or validation data and test set accuracy is also rather low. These results challenge ensemble design techniques in which diversity is explicitly maximized or in which ensemble accuracy on a hold-out set is used for optimization.
  • Keywords
    learning (artificial intelligence); neural nets; optimisation; machine learning; neural network ensemble; optimization; Artificial neural networks; Design optimization; Diversity methods; Diversity reception; Equations; Informatics; Neural networks; Predictive models; Support vector machines; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2007. IJCNN 2007. International Joint Conference on
  • Conference_Location
    Orlando, FL
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1379-9
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2007.4371035
  • Filename
    4371035
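
The abstract singles out two diversity measures, double fault and difficulty, without defining them. The sketch below illustrates how they are commonly computed from "oracle" outputs (1 if an ensemble member classifies an instance correctly, 0 otherwise). It assumes the widely used definitions: double fault as the fraction of instances that two members both misclassify, averaged over all pairs, and difficulty as the variance, across instances, of the fraction of members that are correct. The paper's exact formulations are not given in this record, and the function names and toy data are purely illustrative.

```python
from itertools import combinations
from statistics import pvariance


def double_fault(correct_a, correct_b):
    """Fraction of instances that both classifiers misclassify."""
    if len(correct_a) != len(correct_b):
        raise ValueError("oracle vectors must have the same length")
    both_wrong = sum(1 for a, b in zip(correct_a, correct_b) if not a and not b)
    return both_wrong / len(correct_a)


def mean_double_fault(oracle):
    """Average pairwise double fault over all pairs of ensemble members."""
    pairs = list(combinations(oracle, 2))
    return sum(double_fault(a, b) for a, b in pairs) / len(pairs)


def difficulty(oracle):
    """Variance over instances of the fraction of members that are correct."""
    n_members = len(oracle)
    fractions = [sum(column) / n_members for column in zip(*oracle)]
    return pvariance(fractions)


# Hypothetical oracle outputs: 3 ensemble members (rows), 4 instances (columns).
oracle = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]

print(mean_double_fault(oracle))
print(difficulty(oracle))
```

For both measures, lower values indicate a more diverse ensemble. In a study of the kind the abstract describes, such values would be computed for many ensembles and then correlated with each ensemble's test set accuracy.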