DocumentCode
3570887
Title
Improving the random forest algorithm by randomly varying the size of the bootstrap samples
Author
Adnan, Md Nasim
Author_Institution
Sch. of Comput. & Math., Charles Sturt Univ., Bathurst, NSW, Australia
fYear
2014
Firstpage
303
Lastpage
308
Abstract
For high-dimensional datasets, the Random Forest algorithm generates quite diverse decision trees as base classifiers by applying the Random Subspace algorithm to bootstrap samples. For low-dimensional datasets, however, the diversity among the trees falls sharply. To increase ensemble accuracy by inducing more diversity among the decision trees, we take a different approach. In Random Forest, the bootstrap sample used to generate each decision tree always has the same size. We propose to vary the size of the bootstrap samples randomly within a predefined range in order to increase forest accuracy. We conduct extensive experiments on several datasets from the UCI Machine Learning Repository. The experimental results support the proposed technique.
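The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function name, the `min_frac` and `max_frac` bounds, and their default values are assumptions chosen for illustration, since the abstract only says the size varies "within a predefined range". The sketch covers only the drawing of bootstrap samples; tree induction and the Random Subspace step are left out.

```python
import random


def draw_variable_bootstrap_indices(n_records, n_trees,
                                    min_frac=0.5, max_frac=1.5, seed=0):
    """Return one list of sampled row indices per tree.

    Assumption: the "predefined range" is expressed as fractions of the
    training set size. min_frac and max_frac are hypothetical parameters,
    not values taken from the paper.
    """
    if n_records < 1:
        raise ValueError("n_records must be at least 1")
    if not 0 < min_frac <= max_frac:
        raise ValueError("need 0 < min_frac <= max_frac")

    rng = random.Random(seed)
    low = max(1, int(min_frac * n_records))
    high = max(low, int(max_frac * n_records))

    samples = []
    for _ in range(n_trees):
        # Pick a different sample size for each tree (both ends inclusive).
        size = rng.randint(low, high)
        # Draw `size` row indices with replacement, as in a bootstrap sample.
        samples.append([rng.randrange(n_records) for _ in range(size)])
    return samples


if __name__ == "__main__":
    samples = draw_variable_bootstrap_indices(n_records=100, n_trees=5)
    # Each tree would be trained on the rows selected by its own index list.
    print([len(s) for s in samples])
```

In a standard Random Forest every index list would have length `n_records`; here the lengths differ from tree to tree, which is the source of the extra diversity the abstract describes.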
Keywords
decision trees; learning (artificial intelligence); pattern classification; UCI machine learning repository; base classifiers; bootstrap files; bootstrap samples; high dimensional datasets; low dimensional datasets; random forest algorithm; random subspace algorithm; Accuracy; Classification algorithms; Ionosphere; Prediction algorithms; Training; Vegetation; decision forest; decision tree; prediction accuracy; random forest
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Reuse and Integration (IRI), 2014 IEEE 15th International Conference on
Type
conf
DOI
10.1109/IRI.2014.7051904
Filename
7051904