Title :
Batch-Mode Active Learning with Semi-supervised Cluster Tree for Text Classification
Author :
Zhaocai Sun ; Yunming Ye ; Xiaofeng Zhang ; Zhexue Huang ; Shudong Chen ; Zhi Liu
Author_Institution :
Shenzhen Grad. Sch., Harbin Inst. of Technol., Shenzhen, China
Abstract :
In web mining, there are situations in which only a small amount of data is labeled, which poses difficulties for traditional web page classification algorithms. Active learning has been proposed to sample the most representative unlabeled data, which are then annotated by external oracles. Most existing active learning methods rely on a serial (series-mode) query strategy, which makes the active learning process inefficient and unstable. In this paper, we propose a novel text-oriented active semi-supervised classification model, called active SSC. Compared with other active learning approaches, our model is comprehensible, which makes it easy to design a batch-mode query strategy. Experimental results on public text data show that our method is an effective and stable active learning approach.
Keywords :
Internet; data mining; learning (artificial intelligence); pattern classification; pattern clustering; text analysis; trees (mathematics); Web mining; active SSC; batch-mode active learning scheme; batch-mode query strategy; external oracles; public text data; representative unlabeled data; semisupervised cluster tree; text oriented active semisupervised classification model; active learning; batch mode; semi-supervised learning; text classification;
Conference_Title :
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on
Conference_Location :
Macau
Print_ISBN :
978-1-4673-6057-9
DOI :
10.1109/WI-IAT.2012.237