DocumentCode
1574045
Title
A Cluster-based Regrouping approach for Imbalanced data distributions
Author
Yu, Wen ; Jiang, ShengYi
Author_Institution
School of Management, Guangdong University of Foreign Studies, Guangzhou 510006, China
fYear
2012
Firstpage
121
Lastpage
124
Abstract
In real-world applications, class imbalance (significant differences in class prior probabilities) has been observed to degrade classifier performance substantially, particularly for patterns belonging to the less-represented classes. In this paper, we propose a Cluster-based Regrouping (CR) approach, which divides the whole training set into a positive group and a negative group by clustering through the outlier factor. As a result, similar samples fall into the same group, while dissimilar samples fall into different groups. A base classifier is then used to build a model on each of the positive and negative groups. To classify a new object, the model is chosen according to the type of the group to which the new object is nearest. The experimental results demonstrate that our approach achieves promising performance in some cases by directly or indirectly reducing the skewness of the class distribution.
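The pipeline outlined in the abstract (cluster, regroup, train one model per group, route a new object to the model of its nearest group) can be illustrated with a minimal sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: the abstract does not define the outlier factor, so the sketch regroups clusters by their share of positive samples (the hypothetical parameter `min_positive_share`); a simple leader-style pass with a hypothetical `radius` stands in for the one-pass clustering named in the keywords; and a nearest-class-centroid model stands in for the C4.5 and Naïve Bayes base classifiers. All function names are hypothetical.

```python
import math
from collections import Counter

def dist(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def one_pass_cluster(X, radius):
    """Leader-style single pass: a point joins the nearest cluster whose
    centroid lies within `radius`, otherwise it starts a new cluster."""
    clusters = []
    for i, x in enumerate(X):
        best = min(clusters, key=lambda c: dist(x, c["centroid"]), default=None)
        if best is not None and dist(x, best["centroid"]) <= radius:
            best["members"].append(i)
            n = len(best["members"])
            # Incremental mean update of the centroid.
            best["centroid"] = [(c * (n - 1) + v) / n
                                for c, v in zip(best["centroid"], x)]
        else:
            clusters.append({"centroid": list(x), "members": [i]})
    return clusters

def fit_centroid_model(X, y):
    """Stand-in base classifier: one mean vector per class label."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid_model(model, x):
    """Predict the label whose class mean is closest to x."""
    return min(model, key=lambda label: dist(x, model[label]))

def fit_cr(X, y, radius=1.5, positive_label=1, min_positive_share=0.3):
    """Cluster, assign each cluster to a group, and train one model per group.
    ASSUMPTION: the positive-share rule replaces the paper's outlier factor."""
    clusters = one_pass_cluster(X, radius)
    groups = {"positive": [], "negative": []}
    for c in clusters:
        share = sum(y[i] == positive_label for i in c["members"]) / len(c["members"])
        c["group"] = "positive" if share >= min_positive_share else "negative"
        groups[c["group"]].extend(c["members"])
    models = {name: fit_centroid_model([X[i] for i in idx], [y[i] for i in idx])
              for name, idx in groups.items() if idx}
    return clusters, models

def predict_cr(clusters, models, x):
    """Route x to the model of the group that owns its nearest cluster."""
    nearest = min(clusters, key=lambda c: dist(x, c["centroid"]))
    return predict_centroid_model(models[nearest["group"]], x)

# Toy data: a pure majority-class region and a mixed region holding the minority class.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.2), (0.1, 0.1),
     (5.0, 5.0), (5.2, 5.1), (4.0, 5.0)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 0]
clusters, models = fit_cr(X, y)
print(predict_cr(clusters, models, (5.1, 5.0)))  # routed to the positive group's model
print(predict_cr(clusters, models, (0.1, 0.1)))  # routed to the negative group's model
```

In this toy run the model trained on the positive group sees two positive samples against one negative, rather than two against seven in the full training set, which is one way to read the abstract's remark about reducing class distribution skewness.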
Keywords
C4.5; Imbalanced data classification; Naïve Bayes; One-pass clustering
fLanguage
English
Publisher
IEEE
Conference_Titel
World Automation Congress (WAC), 2012
Conference_Location
Puerto Vallarta, Mexico
ISSN
2154-4824
Print_ISBN
978-1-4673-4497-5
Type
conf
Filename
6321051