DocumentCode
3106694
Title
Exploratory Under-Sampling for Class-Imbalance Learning
Author
Liu, Xu-Ying ; Wu, Jianxin ; Zhou, Zhi-Hua
Author_Institution
Nat. Lab. for Novel Software Technol., Nanjing Univ., Nanjing
fYear
2006
fDate
18-22 Dec. 2006
Firstpage
965
Lastpage
969
Abstract
Under-sampling is a class-imbalance learning method that trains on only a subset of the majority class examples, which makes it very efficient. Its main deficiency is that many majority class examples are ignored. We propose two algorithms to overcome this deficiency. EasyEnsemble samples several subsets from the majority class, trains a learner on each of them, and combines the outputs of those learners. BalanceCascade is similar to EasyEnsemble, except that it removes majority class examples that the trained learners already classify correctly from further consideration. Experiments show that both proposed algorithms achieve better AUC scores than many existing class-imbalance learning methods. Moreover, their training time is approximately the same as that of under-sampling, which trains significantly faster than other methods.
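The two sampling schemes described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The base learner (`train_centroid`, a nearest-centroid scorer), the fixed decision threshold of 0, the averaging of scores, and all function and parameter names are assumptions made for illustration; the abstract does not specify the base learner or how outputs are combined. The sketch also assumes the majority class has at least as many examples as the minority class.

```python
import random


def train_centroid(pos, neg):
    """Toy base learner (an assumption; stands in for any real learner).

    Returns a scoring function: a higher score means "more minority-like".
    """
    dim = len(pos[0])
    c_pos = [sum(x[i] for x in pos) / len(pos) for i in range(dim)]
    c_neg = [sum(x[i] for x in neg) / len(neg) for i in range(dim)]

    def score(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
        return d_neg - d_pos  # > 0: closer to the minority centroid

    return score


def easy_ensemble(minority, majority, n_subsets=5, seed=0):
    """Train one learner per balanced majority subset and average the scores."""
    rng = random.Random(seed)
    learners = []
    for _ in range(n_subsets):
        # Each subset is drawn independently from the full majority class.
        subset = rng.sample(majority, len(minority))
        learners.append(train_centroid(minority, subset))
    return lambda x: sum(f(x) for f in learners) / len(learners)


def balance_cascade(minority, majority, n_rounds=5, seed=0):
    """Like easy_ensemble, but shrink the majority pool after every round."""
    rng = random.Random(seed)
    pool = list(majority)
    learners = []
    for _ in range(n_rounds):
        if not pool:
            break  # every majority example is already handled
        subset = rng.sample(pool, min(len(minority), len(pool)))
        f = train_centroid(minority, subset)
        learners.append(f)
        # Keep only majority examples this learner still gets wrong
        # (score > 0 means it mistakes them for the minority class).
        pool = [x for x in pool if f(x) > 0]
    return lambda x: sum(f(x) for f in learners) / len(learners)
```

Both functions return a scoring function, so a call such as `easy_ensemble(minority, majority)((3.0, 3.0))` yields a single number whose sign gives the predicted class under the assumed threshold. The only difference between the two is the pool that each round samples from: EasyEnsemble always samples from the full majority class, while BalanceCascade samples from the examples that earlier learners have not yet classified correctly.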
Keywords
learning (artificial intelligence); BalanceCascade; EasyEnsemble; class-imbalance learning; exploratory undersampling; Data mining; Educational institutions; Laboratories; Learning systems; Sampling methods;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location
Hong Kong
ISSN
1550-4786
Print_ISBN
0-7695-2701-7
Type
conf
DOI
10.1109/ICDM.2006.68
Filename
4053136