• DocumentCode
    2119209
  • Title
    Batch-Mode Active Learning with Semi-supervised Cluster Tree for Text Classification
  • Author
    Zhaocai Sun ; Yunming Ye ; Xiaofeng Zhang ; Zhexue Huang ; Shudong Chen ; Zhi Liu
  • Author_Institution
    Shenzhen Grad. Sch., Harbin Inst. of Technol., Shenzhen, China
  • Volume
    1
  • fYear
    2012
  • fDate
    4-7 Dec. 2012
  • Firstpage
    388
  • Lastpage
    395
  • Abstract
    In web mining, there are situations in which only a small amount of data is labeled, which poses difficulties for traditional web page classification algorithms. Active learning schemes have been proposed to sample the most representative unlabeled data, which are then annotated by external oracles. Most existing active learning methods are based on a serial-mode query strategy, which makes the active learning process inefficient and unstable. In this paper, we propose a novel text-oriented active semi-supervised classification model, called active SSC. Compared with other active approaches, our model is comprehensible, which makes it easy to design a batch-mode query strategy. Experimental results on public text data show that our method is an effective and stable active learning approach.
  • Keywords
    Internet; data mining; learning (artificial intelligence); pattern classification; pattern clustering; text analysis; trees (mathematics); Web mining; active SSC; batch-mode active learning scheme; batch-mode query strategy; external oracles; public text data; representative unlabeled data; semisupervised cluster tree; text oriented active semisupervised classification model; active learning; batch mode; semi-supervised learning; text classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on
  • Conference_Location
    Macau
  • Print_ISBN
    978-1-4673-6057-9
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2012.237
  • Filename
    6511913