• DocumentCode
    1186142
  • Title

    Batch Mode Active Learning with Applications to Text Categorization and Image Retrieval

  • Author

    Hoi, Steven C H ; Jin, Rong ; Lyu, Michael R.

  • Author_Institution
    Div. of Inf. Syst., Nanyang Technol. Univ., Singapore, Singapore
  • Volume
    21
  • Issue
    9
  • fYear
    2009
  • Firstpage
    1233
  • Lastpage
    1248
  • Abstract
    Most machine learning tasks in data classification and information retrieval require manually labeled data examples in the training stage. The goal of active learning is to select the most informative examples for manual labeling in these learning tasks. Most of the previous studies in active learning have focused on selecting a single unlabeled example in each iteration. This could be inefficient, since the classification model has to be retrained for every acquired labeled example. It is also inappropriate for the setup of information retrieval tasks where the user's relevance feedback is often provided for the top K retrieved items. In this paper, we present a framework for batch mode active learning, which selects a number of informative examples for manual labeling in each iteration. The key feature of batch mode active learning is to reduce the redundancy among the selected examples such that each example provides unique information for model updating. To this end, we employ the Fisher information matrix as the measurement of model uncertainty, and choose the set of unlabeled examples that can efficiently reduce the Fisher information of the classification model. We apply our batch mode active learning framework to both text categorization and image retrieval. Promising results show that our algorithms are significantly more effective than the active learning approaches that select unlabeled examples based only on their informativeness for the classification model.
  • Keywords
    image classification; image retrieval; learning (artificial intelligence); matrix algebra; text analysis; Fisher information matrix; batch mode active learning framework; data classification model; information retrieval; machine learning; text categorization; user relevance feedback; Batch mode active learning; convex optimization; kernel logistic regressions; logistic regressions
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2009.60
  • Filename
    4798162
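
The selection principle described in the abstract (choose a batch whose members are informative but not redundant with one another) can be illustrated with a minimal Python sketch. It assumes a binary logistic regression model with a fixed weight vector `w`, scores a candidate batch by the trace of the inverse batch Fisher information times the pool Fisher information, and grows the batch greedily. The trace criterion, the greedy loop, the `ridge` term, and the names `fisher_weights`, `fisher_information`, and `greedy_batch` are assumptions made for illustration; the abstract does not specify the optimization procedure, and this is not presented as the authors' algorithm.

```python
import numpy as np


def fisher_weights(X, w):
    """Per-example weight p(1 - p) under a logistic model with weights w."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return p * (1.0 - p)


def fisher_information(X, w, ridge=1e-3):
    """Fisher information of logistic regression over the rows of X.

    The ridge term (an assumption of this sketch) keeps the matrix invertible.
    """
    s = fisher_weights(X, w)
    return (X * s[:, None]).T @ X + ridge * np.eye(X.shape[1])


def greedy_batch(X_pool, w, k, ridge=1e-3):
    """Greedily pick k pool indices that minimize trace(I_batch^{-1} I_pool).

    Assumed criterion: a small trace means the batch covers the directions
    in which the pool carries information, so near-duplicates score poorly.
    """
    n, d = X_pool.shape
    s = fisher_weights(X_pool, w)
    I_pool = fisher_information(X_pool, w, ridge)
    I_batch = ridge * np.eye(d)  # information of the (initially empty) batch
    selected = []
    for _ in range(min(k, n)):
        best_i, best_score = None, np.inf
        for i in range(n):
            if i in selected:
                continue
            x = X_pool[i]
            candidate = I_batch + s[i] * np.outer(x, x)
            # trace(candidate^{-1} @ I_pool) without forming the inverse
            score = np.trace(np.linalg.solve(candidate, I_pool))
            if score < best_score:
                best_i, best_score = i, score
        selected.append(best_i)
        x = X_pool[best_i]
        I_batch = I_batch + s[best_i] * np.outer(x, x)
    return selected
```

For example, with the pool `[[1, 0], [1, 0], [0, 1]]`, `w = np.zeros(2)`, and `k = 2`, this sketch returns `[0, 2]`: the duplicate of the first example is skipped because it adds little information beyond what the batch already contains.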