Title :
Distribution based ensemble for class imbalance learning
Author :
Mustafa, Ghulam ; Zhendong Niu ; Yousif, Abdallah ; Tarus, John
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
Abstract :
The MultiBoost ensemble has been well acknowledged as an effective learning algorithm which is able to reduce both bias and variance in error and has high generalization performance. However, to deal with class imbalanced learning, MultiBoost must be amended. In this paper, a new hybrid machine learning method called Distribution based MultiBoost (DBMB) for class imbalanced problems is proposed, which combines distribution based balanced sampling with the MultiBoost algorithm to achieve better minority class performance. It minimizes the within class and between class imbalance by learning and sampling different distributions (Gaussian and Poisson) and reduces bias and variance in error by employing the MultiBoost ensemble. Therefore, DBMB can output a final strong learner that is a more proficient ensemble of weak base learners for imbalanced data sets. We prove that the G-mean, F1 measure and AUC of DBMB are significantly superior to those of other methods. The experimental verification has shown that the proposed DBMB outperforms other state-of-the-art algorithms on many real world class imbalanced problems. Furthermore, our proposed method is scalable compared to other boosting methods.
Keywords :
Gaussian distribution; Poisson distribution; learning (artificial intelligence); sampling methods; AUC; DBMB; F1 measure; G-mean; Gaussian distributions; Poisson distributions; class imbalance learning; class imbalanced problems; distribution based balanced sampling; distribution based ensemble; distribution based multiBoost; error bias; error variance; hybrid machine learning method; learning algorithm; minority class performance; multiBoost algorithm; multiBoost ensemble; Accuracy; Boosting; Committees; Gaussian distribution; Machine learning algorithms; Standards; Training; Class imbalance learning; MultiBoost; distribution based resampling; ensemble learning;
Conference_Title :
Innovative Computing Technology (INTECH), 2015 Fifth International Conference on
Conference_Location :
Galicia
DOI :
10.1109/INTECH.2015.7173365