• DocumentCode
    728842
  • Title

    Distribution based ensemble for class imbalance learning

  • Author

    Mustafa, Ghulam ; Zhendong Niu ; Yousif, Abdallah ; Tarus, John

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
  • fYear
    2015
  • fDate
    20-22 May 2015
  • Firstpage
    5
  • Lastpage
    10
  • Abstract
    The MultiBoost ensemble is well acknowledged as an effective learning algorithm that reduces both the bias and the variance components of error and has high generalization performance. However, MultiBoost must be adapted to handle class imbalanced learning. In this paper, a new hybrid machine learning method for class imbalanced problems, called Distribution based MultiBoost (DBMB), is proposed. It combines distribution based balanced sampling with the MultiBoost algorithm to achieve better minority class performance. DBMB minimizes within-class and between-class imbalance by learning and sampling from different distributions (Gaussian and Poisson), and it reduces bias and variance in error by employing the MultiBoost ensemble. As a result, the final strong learner output by DBMB is a more proficient ensemble of weak base learners for imbalanced data sets. We show that the G-mean, F1 measure and AUC of DBMB are significantly superior to those of the other methods. Experimental verification shows that the proposed DBMB outperforms other state-of-the-art algorithms on many real-world class imbalanced problems. Furthermore, the proposed method is scalable compared to other boosting methods.
  • Keywords
    Gaussian distribution; Poisson distribution; learning (artificial intelligence); sampling methods; AUC; DBMB; F1 measure; G-mean; Gaussian distributions; Poisson distributions; class imbalance learning; class imbalanced problems; distribution based balanced sampling; distribution based ensemble; distribution based multiBoost; error bias; error variance; hybrid machine learning method; learning algorithm; minority class performance; multiBoost algorithm; multiBoost ensemble; Accuracy; Boosting; Committees; Gaussian distribution; Machine learning algorithms; Standards; Training; Class imbalance learning; MultiBoost; distribution based resampling; ensemble learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovative Computing Technology (INTECH), 2015 Fifth International Conference on
  • Conference_Location
    Galicia
  • Type

    conf

  • DOI
    10.1109/INTECH.2015.7173365
  • Filename
    7173365
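
The abstract describes balancing classes by sampling from learned distributions and evaluating with the G-mean. The sketch below is only an illustration of that general idea, not the authors' DBMB procedure: it assumes numeric features, fits an independent Gaussian to each minority-class feature, and draws synthetic rows until a target size is reached. The function names `gaussian_oversample` and `g_mean` are hypothetical, and the Poisson sampling and the MultiBoost stage mentioned in the abstract are not shown.

```python
import random
import statistics
from math import sqrt


def gaussian_oversample(minority, target_size, seed=0):
    """Grow `minority` (a list of numeric feature tuples) to `target_size` rows.

    Assumption for illustration: each feature is modelled by an independent
    Gaussian fitted to the minority class. This is not the paper's exact method.
    """
    rng = random.Random(seed)
    columns = list(zip(*minority))  # one tuple of values per feature
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    n_new = max(0, target_size - len(minority))
    synthetic = [
        tuple(rng.gauss(m, s) for m, s in zip(means, stdevs))
        for _ in range(n_new)
    ]
    # Original rows come first, followed by the synthetic ones.
    return list(minority) + synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sqrt(sensitivity * specificity)


if __name__ == "__main__":
    minority = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2)]
    balanced = gaussian_oversample(minority, target_size=10)
    print(len(balanced))
    print(g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a setup like this, the balanced minority set would be concatenated with the majority class before training each base learner, and the G-mean would be computed on held-out data that has not been resampled.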