DocumentCode
728842
Title
Distribution based ensemble for class imbalance learning
Author
Mustafa, Ghulam ; Zhendong Niu ; Yousif, Abdallah ; Tarus, John
Author_Institution
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
fYear
2015
fDate
20-22 May 2015
Firstpage
5
Lastpage
10
Abstract
The MultiBoost ensemble is well acknowledged as an effective learning algorithm that reduces both the bias and the variance components of error and achieves high generalization performance. However, MultiBoost must be adapted to handle class-imbalanced learning. In this paper, a new hybrid machine learning method for class-imbalanced problems, called Distribution based MultiBoost (DBMB), is proposed. It combines distribution-based balanced sampling with the MultiBoost algorithm to achieve better minority class performance. DBMB minimizes within-class and between-class imbalance by learning and sampling from different distributions (Gaussian and Poisson), and it reduces bias and variance in error by employing the MultiBoost ensemble. As a result, the final strong learner output by DBMB is a more proficient ensemble of weak base learners for imbalanced data sets. We prove that the G-mean, F1 measure and AUC of DBMB are significantly superior to those of other methods. Experimental verification shows that the proposed DBMB outperforms other state-of-the-art algorithms on many real-world class-imbalanced problems. Furthermore, the proposed method is more scalable than other boosting methods.
Keywords
Gaussian distribution; Poisson distribution; learning (artificial intelligence); sampling methods; AUC; DBMB; F1 measure; G-mean; Gaussian distributions; Poisson distributions; class imbalance learning; class imbalanced problems; distribution based balanced sampling; distribution based ensemble; distribution based multiBoost; error bias; error variance; hybrid machine learning method; learning algorithm; minority class performance; multiBoost algorithm; multiBoost ensemble; Accuracy; Boosting; Committees; Gaussian distribution; Machine learning algorithms; Standards; Training; Class imbalance learning; MultiBoost; distribution based resampling; ensemble learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Innovative Computing Technology (INTECH), 2015 Fifth International Conference on
Conference_Location
Galicia
Type
conf
DOI
10.1109/INTECH.2015.7173365
Filename
7173365