• DocumentCode
    1574045
  • Title

    A Cluster-based Regrouping approach for Imbalanced data distributions

  • Author

    Yu, Wen ; Jiang, ShengYi

  • Author_Institution
    School of Management, Guangdong University of Foreign Studies, Guangzhou 510006, China
  • fYear
    2012
  • Firstpage
    121
  • Lastpage
    124
  • Abstract
    In real-world applications, class imbalance (significant differences in class prior probabilities) has been observed to degrade classifier performance considerably, particularly on patterns belonging to the less represented classes. In this paper, we propose a Cluster-based Regrouping approach (CR) that divides the whole training data into a positive group and a negative group by clustering through the outlier factor. As a result, similar samples fall in the same group while dissimilar samples fall in different groups. A base classifier is then used to build a separate model on the positive group and on the negative group. When classifying a new object, the model used for evaluation is chosen according to the type of the group to which the new object is nearest. The experimental results demonstrate that our approach achieves promising performance in some cases by directly or indirectly reducing the class distribution skewness.
  • Keywords
    C4.5; Imbalanced data classification; Naïve Bayes; One-pass clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    World Automation Congress (WAC), 2012
  • Conference_Location
    Puerto Vallarta, Mexico
  • ISSN
    2154-4824
  • Print_ISBN
    978-1-4673-4497-5
  • Type

    conf

  • Filename
    6321051
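
The pipeline outlined in the abstract (cluster the training data, regroup the clusters into a positive and a negative group, train one model per group, route a new object to the model of its nearest group) can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions. The abstract does not define the outlier factor, so the hypothetical `regroup` function swaps in a simpler rule: a cluster is "pos" when its share of positive samples exceeds the overall share. The one-pass clustering uses a hypothetical fixed `radius` threshold. A 1-nearest-neighbour classifier stands in for the C4.5 and Naïve Bayes base classifiers named in the keywords. "Nearest group" is taken to mean the group of the nearest cluster centre.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_pass_cluster(X, radius):
    """Single scan: join the nearest centre within `radius`, else open a new cluster."""
    centres, members = [], []
    for i, x in enumerate(X):
        best, best_d = None, radius
        for c, centre in enumerate(centres):
            d = dist(x, centre)
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            centres.append(list(x))
            members.append([i])
        else:
            members[best].append(i)
            n = len(members[best])
            # Running mean keeps the centre equal to the average of its members.
            centres[best] = [(m * (n - 1) + v) / n for m, v in zip(centres[best], x)]
    return centres, members


def regroup(y, members, positive_label=1):
    """ASSUMPTION: stand-in for the outlier factor. A cluster is 'pos' when its
    positive share is above the overall positive share, otherwise 'neg'."""
    overall = sum(1 for v in y if v == positive_label) / len(y)
    groups = []
    for idx in members:
        share = sum(1 for i in idx if y[i] == positive_label) / len(idx)
        groups.append("pos" if share > overall else "neg")
    return groups


class OneNN:
    """Stand-in base classifier (the record's keywords name C4.5 and Naive Bayes)."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict_one(self, x):
        return min(zip(self.X, self.y), key=lambda p: dist(x, p[0]))[1]


def fit_cr(X, y, radius, positive_label=1):
    """Cluster, regroup, then train one base model per group."""
    centres, members = one_pass_cluster(X, radius)
    groups = regroup(y, members, positive_label)
    models = {}
    for g in ("pos", "neg"):
        idx = [i for c, m in enumerate(members) if groups[c] == g for i in m]
        if idx:
            models[g] = OneNN().fit([X[i] for i in idx], [y[i] for i in idx])
    return centres, groups, models


def predict_cr(model, x):
    """Route x to the model of the group that owns its nearest cluster centre."""
    centres, groups, models = model
    nearest = min(range(len(centres)), key=lambda c: dist(x, centres[c]))
    return models[groups[nearest]].predict_one(x)


# Toy data: a mostly negative cluster near the origin, a mostly positive one near (10, 10).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8), (10, 10), (10, 11), (11, 10)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 0]
model = fit_cr(X, y, radius=3.0)
print(model[1])
print(predict_cr(model, (10.1, 10.9)), predict_cr(model, (0.9, 0.1)))
```

In this toy run, each per-group model is trained on a subset whose class ratio differs from the overall ratio of three positives in nine samples, which is one way to read the abstract's claim of reducing class distribution skewness. The `radius` value and the grouping rule would need to be replaced by the paper's own definitions to reproduce its results.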