• DocumentCode
    2219709
  • Title
    One tree to explain them all
  • Author
    Johansson, Ulf ; Sönströd, Cecilia ; Löfström, Tuve
  • Author_Institution
    Sch. of Bus. & Inf., Univ. of Boras, Boras, Sweden
  • fYear
    2011
  • fDate
    5-8 June 2011
  • Firstpage
    1444
  • Lastpage
    1451
  • Abstract
    Random forest is an often-used ensemble technique, renowned for its high predictive performance. Random forest models are, however, inherently opaque due to their sheer complexity, making human interpretation and analysis impossible. This paper presents a method of approximating the random forest with just one decision tree. The approach uses oracle coaching, a recently suggested technique where a weaker but transparent model is generated using combinations of regular training data and test data initially labeled by a strong classifier, called the oracle. In this study, the random forest plays the part of the oracle, while the transparent models are decision trees generated either by the standard tree inducer J48 or by evolving genetic programs. Evaluation on 30 data sets from the UCI repository shows that oracle coaching significantly improves both accuracy and area under the ROC curve, compared to using training data only. As a matter of fact, the resulting single-tree models are as accurate as the random forest on the specific test instances. Most importantly, this is not achieved by inducing or evolving huge trees with perfect fidelity; a large majority of all trees are instead rather compact and clearly comprehensible. The experiments also show that the evolutionary approach outperformed J48 with regard to accuracy, but that this came at the expense of slightly larger trees.
  • Keywords
    decision trees; learning (artificial intelligence); pattern classification; ROC curve; UCI; decision tree; ensemble technique; genetic program; human interpretation; oracle coaching; random forest; regular training data; Accuracy; Data models; Decision trees; Predictive models; Training; Training data; Vegetation;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Evolutionary Computation (CEC), 2011 IEEE Congress on
  • Conference_Location
    New Orleans, LA
  • ISSN
    Pending
  • Print_ISBN
    978-1-4244-7834-7
  • Type
    conf
  • DOI
    10.1109/CEC.2011.5949785
  • Filename
    5949785
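
The oracle coaching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the oracle here is a small k-nearest-neighbour classifier standing in for the random forest, and the transparent model is a one-split decision stump standing in for J48 or an evolved tree. All function names (`fit_knn`, `fit_stump`, `oracle_coach`, `fidelity`) are hypothetical. The sketch uses one possible data combination, training data plus oracle-labeled test instances; the abstract mentions several combinations without listing them.

```python
from collections import Counter


def fit_knn(X, y, k=3):
    """Stand-in oracle: k-nearest-neighbour (assumption, NOT a random forest)."""
    def predict(x):
        # Rank training rows by squared Euclidean distance to x.
        order = sorted(
            range(len(X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)),
        )
        return Counter(y[i] for i in order[:k]).most_common(1)[0][0]
    return predict


def fit_stump(X, y):
    """Stand-in transparent model: one-split stump (assumption, NOT J48/GP)."""
    majority = Counter(y).most_common(1)[0][0]
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No usable split: fall back to the majority class.
        return lambda x: majority
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def oracle_coach(train_X, train_y, test_X, fit_oracle, fit_student):
    """Train the oracle, let it label the test inputs, then fit the student
    on the training data plus the oracle-labeled test instances."""
    oracle = fit_oracle(train_X, train_y)
    oracle_labels = [oracle(x) for x in test_X]
    student = fit_student(train_X + test_X, train_y + oracle_labels)
    return student, oracle_labels


def fidelity(student, test_X, oracle_labels):
    """Fraction of test instances where the student agrees with the oracle."""
    agree = sum(student(x) == lab for x, lab in zip(test_X, oracle_labels))
    return agree / len(test_X)


# Toy usage with made-up one-dimensional data.
train_X = [[0.0], [1.0], [2.0], [7.0], [8.0], [9.0]]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [[3.0], [4.0], [5.0], [6.0]]
student, oracle_labels = oracle_coach(train_X, train_y, test_X, fit_knn, fit_stump)
print(oracle_labels, fidelity(student, test_X, oracle_labels))
```

Note that only the test inputs are used, never the true test labels, so the student is tuned to the specific test instances it will later be asked to explain. Passing the learners in as functions keeps the coaching step independent of which oracle and which transparent model are chosen.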