• DocumentCode
    3535502
  • Title
    Improving classification performance for the minority class in highly imbalanced dataset using boosting
  • Author
    Abouelenien, Mohamed; Yuan, Xiaohui; Duraisamy, Prakash; Yuan, Xiaojing
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of North Texas, Denton, TX, USA
  • fYear
    2012
  • fDate
    26-28 July 2012
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Data imbalance is a common property of many medical and biological datasets and usually results in degraded generalization performance. In this article, we present a novel boosting method that addresses two important questions in learning from imbalanced datasets: how to maximize the performance of classifying the minority instances without compromising the performance for the majority instances, and how to select training instances to achieve a comprehensive representation of the data distribution while avoiding high computational time. Our method maximizes the usage of the available samples, with priority given to the minority samples. The base classifiers are weighted by their sensitivities derived from the training examples. Using synthetic and real-world datasets, we demonstrate that our method improves both sensitivity and accuracy without a major reduction in specificity. Compared with AdaBoost, our method took much less time, which makes it applicable to real-world problems that involve large amounts of data.
  • Keywords
    learning (artificial intelligence); pattern classification; AdaBoost; base classifier sensitivity; biological data; boosting method; classification performance; data distribution; highly imbalanced dataset; majority instances; medical data; minority class; minority instances; performance improvement; real-world datasets; synthetic datasets; training instance selection; Accuracy; Boosting; Image segmentation; Sensitivity; Silicon; Support vector machines; Training; Classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Computing Communication & Networking Technologies (ICCCNT), 2012 Third International Conference on
  • Conference_Location
    Coimbatore
  • Type
    conf
  • DOI
    10.1109/ICCCNT.2012.6477850
  • Filename
    6477850
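
The abstract states that the base classifiers are weighted with their sensitivities derived from the training examples, but this record does not give the exact formula. The following is only a minimal sketch of that idea under stated assumptions: the minority class is labeled 1, each classifier's weight is its raw true-positive rate on the training set, and the ensemble predicts by a weighted majority vote. The names `sensitivity`, `weighted_vote`, and the threshold stumps are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence

# A base classifier maps a feature vector to 1 (minority) or 0 (majority).
Classifier = Callable[[Sequence[float]], int]


def sensitivity(clf: Classifier, X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
    """True-positive rate: share of minority (label 1) samples predicted as 1."""
    positives = [x for x, label in zip(X, y) if label == 1]
    if not positives:
        return 0.0
    hits = sum(1 for x in positives if clf(x) == 1)
    return hits / len(positives)


def weighted_vote(classifiers: List[Classifier], weights: List[float],
                  x: Sequence[float]) -> int:
    """Predict 1 if the classifiers voting 1 hold at least half the total weight."""
    total = sum(weights)
    if total == 0:
        return 0  # assumption: fall back to the majority class
    score = sum(w for clf, w in zip(classifiers, weights) if clf(x) == 1)
    return 1 if score >= total / 2 else 0


# Toy imbalanced training set: four majority samples, two minority samples.
X_train = [[0.1], [0.2], [0.3], [0.4], [0.8], [0.9]]
y_train = [0, 0, 0, 0, 1, 1]

# Hypothetical base classifiers: simple threshold stumps.
stumps: List[Classifier] = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[0] > 0.85),
    lambda x: int(x[0] > 0.95),
]

# Assumed weighting: each stump's weight is its training-set sensitivity.
weights = [sensitivity(clf, X_train, y_train) for clf in stumps]

pred_minority = weighted_vote(stumps, weights, [0.82])
pred_majority = weighted_vote(stumps, weights, [0.3])
print(weights, pred_minority, pred_majority)
```

In this sketch a stump that misses every minority sample receives zero weight and has no influence on the vote, which is one simple way to give priority to minority-class performance; the published method may normalize or combine the weights differently.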