Title :
Tri-Training Based Learning from Positive and Unlabeled Data
Author :
Zhang, Bangzuo ; Zuo, Wanli
Author_Institution :
Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun
Abstract :
This paper studies the problem of learning a text classifier from positive and unlabeled examples (the LPU problem) using the tri-training algorithm, which was originally proposed for semi-supervised learning. The key feature of this problem is that no negative examples are available. The paper proposes a new tri-training algorithm for the LPU problem that combines step 1 of three LPU algorithms to extract a set of reliable negative examples, which is then used to build an initial classifier for tri-training; this replaces the bootstrap sampling procedure, which is not regarded as a good method. The three SVM classifiers are then retrained iteratively until they converge. Experiments on the popular Reuters-21578 collection show the effectiveness of the proposed technique.
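The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the abstract does not name the three step-1 techniques, so the hypothetical `reliable_negatives` helper stands in for all three by picking the unlabeled points farthest from the positive centroid at three different cut-offs; a nearest-centroid classifier replaces the SVMs so that the example needs only the standard library; and the error-rate checks of the original tri-training algorithm are omitted.

```python
from statistics import mean

def centroid(points):
    # Component-wise mean of a list of equal-length vectors.
    return [mean(col) for col in zip(*points)]

def sq_dist(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

class CentroidClassifier:
    """Stand-in for an SVM (assumption): predicts the class of the nearer centroid."""
    def fit(self, pos, neg):
        self.cp, self.cn = centroid(pos), centroid(neg)
        return self

    def predict(self, x):
        return 1 if sq_dist(x, self.cp) <= sq_dist(x, self.cn) else 0

def reliable_negatives(pos, unlabeled, keep_fraction):
    # Hypothetical step-1 heuristic: treat the unlabeled points farthest
    # from the positive centroid as reliable negatives.
    cp = centroid(pos)
    ranked = sorted(unlabeled, key=lambda u: sq_dist(u, cp), reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]

def tri_train_pu(pos, unlabeled, fractions=(0.2, 0.3, 0.4), max_rounds=10):
    # One initial classifier per step-1 extractor; this takes the place
    # of the bootstrap sampling used by standard tri-training.
    negs = [reliable_negatives(pos, unlabeled, f) for f in fractions]
    clfs = [CentroidClassifier().fit(pos, n) for n in negs]
    prev = None
    for _ in range(max_rounds):
        preds = [[c.predict(u) for u in unlabeled] for c in clfs]
        if preds == prev:  # no predicted label changed: treat as converged
            break
        prev = preds
        new_clfs = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # Unlabeled examples on which the other two classifiers agree
            # are added to classifier i's training data.
            agreed = list(zip(unlabeled, preds[j], preds[k]))
            extra_pos = [u for u, a, b in agreed if a == b == 1]
            extra_neg = [u for u, a, b in agreed if a == b == 0]
            new_clfs.append(
                CentroidClassifier().fit(pos + extra_pos, negs[i] + extra_neg)
            )
        clfs = new_clfs
    return clfs

def vote(clfs, x):
    # Final prediction by majority vote of the three classifiers.
    return 1 if sum(c.predict(x) for c in clfs) >= 2 else 0
```

As a usage example, calling `tri_train_pu(pos, unlabeled)` with a few labeled positive vectors and a mixed unlabeled pool returns three classifiers, and `vote(clfs, x)` labels a new vector `x` as positive (1) or negative (0).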
Keywords :
bootstrapping; classification; learning (artificial intelligence); sampling methods; support vector machines; text analysis; bootstrap sampling procedure; positive-unlabeled data learning; semi supervised learning; support vector machine classifier; text classifier learning; tri-training based learning; Bayesian methods; Convergence; Educational institutions; Iterative algorithms; Sampling methods; Semisupervised learning; Supervised learning; Support vector machine classification; Support vector machines; Training data; Learning From Positive And Unlabeled Data; Semi-supervised Learning; Tri-training;
Conference_Title :
Information Processing (ISIP), 2008 International Symposiums on
Conference_Location :
Moscow
Print_ISBN :
978-0-7695-3151-9
DOI :
10.1109/ISIP.2008.69