Title :
A Weighted Rough Set Method to Address the Class Imbalance Problem
Author :
Liu, Jin-Fu ; Yu, Da-Ren
Author_Institution :
Harbin Inst. of Technol., Harbin
Abstract :
The class imbalance problem has recently been reported to hinder the performance of learning systems. Most traditional learning algorithms are designed under the assumption of well-balanced data sets; they are biased towards the majority class and may therefore predict minority class examples poorly. In this paper, we develop weighted rough sets (WRS) to deal with this problem. In weighted rough sets, weighted entropy is introduced and extended to compute the information content contributed by attributes. A forward greedy weighted attribute reduction algorithm based on the weighted entropy and a weighted rule extraction algorithm are provided. The factors of weighted strength, weighted certainty and weighted cover are employed to evaluate the extracted rules. Finally, a decision algorithm based on the weighted strength factor is constructed. Using weighted rough sets, a series of class imbalance learning experiments is conducted on 20 UCI data sets. In terms of AUC and minority class accuracy, WRS achieves better results than classical rough sets in class imbalance learning. Moreover, the evaluation of extracted rules has a greater influence than the selection of attributes on weighted rough set learning.
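The abstract does not define its weighted entropy, so the following is only a minimal sketch of one plausible reading: each instance carries a weight, and the probability of a block in a partition is taken as its share of the total weight instead of its share of the instance count. The function names `weighted_entropy` and `inverse_frequency_weights`, and the choice of inverse class frequency as the weighting scheme, are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict


def weighted_entropy(labels, weights):
    """Entropy of a partition where each block's probability is its
    share of the total instance weight (an assumed definition)."""
    total = sum(weights)
    mass = defaultdict(float)
    for label, w in zip(labels, weights):
        mass[label] += w
    return -sum((m / total) * math.log2(m / total)
                for m in mass.values() if m > 0)


def inverse_frequency_weights(labels):
    """Hypothetical weighting scheme: each instance is weighted by
    1 / (size of its class), so every class has equal total weight."""
    counts = defaultdict(int)
    for label in labels:
        counts[label] += 1
    return [1.0 / counts[label] for label in labels]


# Toy imbalanced decision attribute: 9 majority, 1 minority example.
labels = ["maj"] * 9 + ["min"]
uniform = [1.0] * len(labels)
balanced = inverse_frequency_weights(labels)

print(weighted_entropy(labels, uniform))   # roughly 0.469 bits
print(weighted_entropy(labels, balanced))  # close to 1.0 bit
```

With uniform weights the minority class contributes little to the entropy; under the assumed inverse-frequency weights both classes carry equal mass, which is the kind of rebalancing an instance-weighted measure is meant to provide.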
Keywords :
decision theory; entropy; greedy algorithms; learning (artificial intelligence); rough set theory; class imbalance problem; decision algorithm; forward greedy weighted attribute reduction algorithm; learning algorithm; learning system; weighted certainty; weighted cover; weighted entropy; weighted rough set; weighted rule extraction algorithm; weighted strength; Algorithm design and analysis; Cybernetics; Data mining; Entropy; Information systems; Learning systems; Machine learning; Machine learning algorithms; Rough sets; Training data; Class imbalance learning; Instance weighting; Rough sets; Rule extraction; Weighted entropy;
Conference_Title :
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-0973-0
Electronic_ISBN :
978-1-4244-0973-0
DOI :
10.1109/ICMLC.2007.4370789