  • DocumentCode
    3570887
  • Title
    Improving the random forest algorithm by randomly varying the size of the bootstrap samples
  • Author
    Adnan, Md Nasim
  • Author_Institution
    Sch. of Comput. & Math., Charles Sturt Univ., Bathurst, NSW, Australia
  • fYear
    2014
  • Firstpage
    303
  • Lastpage
    308
  • Abstract
    The Random Forest algorithm generates quite diverse decision trees as base classifiers by applying the Random Subspace algorithm to bootstrap samples of high-dimensional datasets. For low-dimensional datasets, however, the diversity among the trees falls sharply. To increase ensemble accuracy by inducing more diversity among the decision trees, we take a different approach. In Random Forest, the bootstrap sample has the same size every time a decision tree is generated as a base classifier. We propose varying the size of the bootstrap samples randomly within a predefined range in order to increase forest accuracy. We conduct extensive experiments on several datasets from the UCI Machine Learning Repository. The experimental results support the effectiveness of the proposed technique.
  • Keywords
    decision trees; learning (artificial intelligence); pattern classification; UCI machine learning repository; base classifiers; bootstrap files; bootstrap samples; decision trees; high dimensional datasets; low dimensional datasets; random forest algorithm; random subspace algorithm; Accuracy; Classification algorithms; Decision trees; Ionosphere; Prediction algorithms; Training; Vegetation; bootstrap samples; decision forest; decision tree; prediction accuracy; random forest;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration (IRI), 2014 IEEE 15th International Conference on
  • Type
    conf
  • DOI
    10.1109/IRI.2014.7051904
  • Filename
    7051904
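
The sampling idea summarized in the abstract (drawing each tree's bootstrap sample with a randomly chosen size inside a predefined range) can be illustrated with a minimal sketch. This is not the author's implementation: the function name `variable_size_bootstraps`, the parameters `min_frac` and `max_frac`, and their default values are hypothetical, since the record does not state the range used in the paper. Only the sampling step is shown; growing a tree on each sample (for example with random feature subspaces) is left out.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def variable_size_bootstraps(
    records: Sequence[T],
    n_samples: int,
    min_frac: float = 0.5,  # hypothetical lower bound, not taken from the paper
    max_frac: float = 1.5,  # hypothetical upper bound, not taken from the paper
    seed: int = 0,
) -> List[List[T]]:
    """Draw bootstrap samples whose sizes vary at random within a range.

    Each sample size is drawn uniformly from
    [min_frac * len(records), max_frac * len(records)], and the sample
    itself is drawn from `records` with replacement.
    """
    if not records:
        raise ValueError("records must be non-empty")
    if not 0 < min_frac <= max_frac:
        raise ValueError("need 0 < min_frac <= max_frac")

    rng = random.Random(seed)
    n = len(records)
    lo = max(1, round(min_frac * n))  # smallest allowed sample size
    hi = max(lo, round(max_frac * n))  # largest allowed sample size

    samples = []
    for _ in range(n_samples):
        size = rng.randint(lo, hi)  # both endpoints inclusive
        samples.append(rng.choices(records, k=size))  # with replacement
    return samples
```

In a forest built this way, each returned sample would train one decision tree, so the trees differ in how much data they see as well as in which records they see. A standard Random Forest corresponds to the special case `min_frac == max_frac == 1.0`.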