• DocumentCode
    3599870
  • Title
    Building diverse and optimized ensembles of gradient boosted trees for high-dimensional data
  • Author
    Abdunabi, Tarek ; Basir, Otman
  • Author_Institution
    Electr. & Comput. Eng. Dept., Univ. of Waterloo, Waterloo, ON, Canada
  • fYear
    2014
  • Firstpage
    351
  • Lastpage
    356
  • Abstract
    Gradient Boosting Machines (GBMs) are powerful ensemble learning techniques that have been successfully applied to several low-dimensional applications. In GBMs, the learning algorithm sequentially fits new models to provide a more accurate prediction of the response variable. Despite their high accuracy, GBMs suffer from major drawbacks such as high memory consumption. In addition, because the learning algorithm is essentially sequential, it is difficult to parallelize by design. Therefore, building optimized GBMs for high-dimensional applications requires powerful computational resources. In this paper, using a real, high-dimensional dataset (1776 predictors), we demonstrate that by using different feature selection/reduction techniques, the computational cost of building and tuning tree-based GBMs can be substantially reduced at the expense of a slight drop in prediction accuracy. To cope with the data-intensive computations involved in building and tuning the ensembles, we utilize the Amazon Elastic Compute Cloud (EC2) web service.
  • Keywords
    Web services; cloud computing; data reduction; feature selection; gradient methods; learning (artificial intelligence); trees (mathematics); Amazon Elastic Compute Cloud; EC2 Web service; GBM; data reduction; data-intensive computation; ensemble learning technique; feature selection; gradient boosted tree; gradient boosting machine; high-dimensional data; learning algorithm; Accuracy; Biology; Operating systems; Radio frequency; Sensitivity; Tuning; Cloud computing; Ensemble learning; High-dimensional data; Predictive modeling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing and Intelligence Systems (CCIS), 2014 IEEE 3rd International Conference on
  • Print_ISBN
    978-1-4799-4720-1
  • Type
    conf
  • DOI
    10.1109/CCIS.2014.7175758
  • Filename
    7175758
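
The abstract's point that a GBM "sequentially fits new models" can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes squared-error loss, depth-1 regression trees (stumps) on a single feature, and a made-up toy dataset. The names `fit_stump`, `fit_gbm`, `predict`, and `mse` are hypothetical helpers introduced only for illustration.

```python
# Minimal gradient boosting sketch (squared-error loss, depth-1 trees).
# Illustrative only: not the paper's method; all names and data are hypothetical.

def fit_stump(xs, residuals):
    """Find the one-threshold split on a 1-D feature that minimizes squared error."""
    best = None
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    if best is None:  # all x values identical: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    _, t, lmean, rmean = best
    return t, lmean, rmean


def stump_predict(stump, x):
    t, lmean, rmean = stump
    return lmean if x <= t else rmean


def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Fit stumps one after another, each to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)            # initial constant model
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # This update depends on the previous round's predictions.
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, lr = model
    return base + lr * sum(stump_predict(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data (made up): a step-like response with small perturbations.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

baseline = fit_gbm(xs, ys, n_rounds=0)   # constant model only
few = fit_gbm(xs, ys, n_rounds=1)
many = fit_gbm(xs, ys, n_rounds=50)
```

Each round needs the predictions from the round before it, which is the dependency that makes the boosting loop hard to parallelize. The split search inside `fit_stump` grows with the number of candidate features in a real implementation, which is where feature selection or reduction would lower the cost.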