• DocumentCode
    3128132
  • Title

    Certainty-Enhanced Active Learning for Improving Imbalanced Data Classification

  • Author

    Fu, JuiHsi ; Lee, SingLing

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Chung Cheng Univ., Chiayi, Taiwan
  • fYear
    2011
  • fDate
    11-11 Dec. 2011
  • Firstpage
    405
  • Lastpage
    412
  • Abstract
    In active learning algorithms, informative samples are usually queried for their true labels according to the disagreement among existing hypotheses. However, we observed that when a streaming dataset has skewed class membership, the imbalanced data classification problem arises in active learning: the minority class is overwhelmed by the majority class when the hypotheses are generated. In this paper, we propose to use, for each unlabeled sample, only the local behavior in its certainty-enhanced neighborhood, rather than the entire dataset, to generate the error-minimization hypotheses. Consequently, the proposed method improves the predictions of the hypotheses and determines query probabilities properly. In our experiments, synthetic and real-world datasets are used to demonstrate the effectiveness of our active learning approach. The results show that the proposed approach decreases the probability of querying a certain (majority) sample and can deal with the imbalanced data classification problem in active learning.
  • Keywords
    learning (artificial intelligence); pattern classification; probability; certainty-enhanced active learning; error minimization hypotheses; improving imbalanced data classification; majority class; minority class; query probabilities; skewed class membership; Algorithm design and analysis; Complexity theory; Measurement uncertainty; Minimization; Polynomials; Supervised learning; Support vector machines; Active Learning; Certainty-Enhanced Neighborhood; Imbalanced Data Classification; Lazy Learning; Streaming Datasets;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    978-1-4673-0005-6
  • Type

    conf

  • DOI
    10.1109/ICDMW.2011.43
  • Filename
    6137408