Title :
Parallel random forest with IPython cluster
Author :
Wasit Limprasert
Author_Institution :
Department of Computer Science, Faculty of Science and Technology, Thammasat University, Pathumthani, Thailand
Abstract :
Recent research studies require analytic tools capable of interpreting patterns and finding hidden knowledge in huge amounts of data. Random Forest, an ensemble tree classifier based on the bagging method, is one of many well-known classifiers for finding hidden models in data. The classifier has been applied to recognize various kinds of data, e.g. human pose from depth images, plankton images, and time-series patterns. In this paper, an optimized parallel Random Forest has been designed and implemented on IPython, an interactive Python environment with parallelization functionality that is convenient to deploy on most computing platforms. The implementation shows 80% CPU utilization when training on 10^7 samples in 12 hours on an EC2 cluster with 32 cores. This demonstrates the capability to analyze large amounts of data.
Keywords :
Decision support systems
Conference_Title :
Computer Science and Engineering Conference (ICSEC), 2015 International
DOI :
10.1109/ICSEC.2015.7401404