Title :
An Integration of CoTraining and Affinity Propagation for PU Text Classification
Author :
Luo, Na ; Yuan, Fuyu ; Zuo, Wanli
Author_Institution :
Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun
Abstract :
Within the PU (Positive and Unlabeled data) framework, this paper proposes a three-step algorithm. First, CoTraining is employed to filter the likely positive data out of the unlabeled dataset U. Second, the affinity propagation (AP) approach picks out the strong positives from the likely positive set produced in the first step; the data picked out are added to the positive dataset P. Finally, a linear One-Class SVM learns from the purified U as negative examples and the expanded P as positive examples. Because the algorithm automatically expands the positive dataset, it performs especially well when the given positive dataset P is insufficient. A comprehensive experiment shows that the algorithm is preferable to existing ones.
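The second step can be illustrated with a small sketch. It covers only the affinity propagation stage; the CoTraining filter and the final SVM are omitted. The abstract does not say how AP decides which likely positives are "strong", so the selection rule here (promote a candidate when it shares an exemplar with a known positive) is an assumption, as are the function names `affinity_propagation` and `select_strong_positives`, the negative squared Euclidean similarity, the median preference, and the toy 2-D vectors standing in for document features.

```python
from statistics import median


def affinity_propagation(S, damping=0.5, iterations=200):
    """Return the exemplar index chosen by each point.

    S is a square similarity matrix whose diagonal holds the preferences.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i, k) = min(0, r(k, k) + positive support from points other than i, k)
        # a(k, k) = positive support from all points other than k
        for k in range(n):
            support = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(support) - support[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - support[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


def select_strong_positives(vectors, known_positive, candidates):
    """Hypothetical selection rule (an assumption, not taken from the paper):
    keep the candidates whose exemplar is also the exemplar of a known positive."""
    n = len(vectors)
    # Similarity: negative squared Euclidean distance between feature vectors.
    S = [[-sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
          for j in range(n)] for i in range(n)]
    # Preference: median of the off-diagonal similarities (a common default).
    preference = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        S[i][i] = preference
    exemplars = affinity_propagation(S)
    positive_exemplars = {exemplars[i] for i in known_positive}
    return [i for i in candidates if exemplars[i] in positive_exemplars]


# Toy data: index 0 is a labeled positive, indices 1-5 are likely positives.
vectors = [(1, 2), (1, 4), (1, 0), (4, 2), (4, 4), (4, 0)]
strong = select_strong_positives(vectors, known_positive=[0],
                                 candidates=[1, 2, 3, 4, 5])
print(strong)  # [1, 2]
```

On this toy input the two groups of points settle on separate exemplars, so only candidates 1 and 2 would be moved into P. Real text data would need a document similarity (for example, cosine similarity over term vectors) in place of the Euclidean one used here.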
Keywords :
support vector machines; text analysis; text editing; CoTraining; affinity propagation; linear one-class SVM; positive dataset; support vector machine; text classification; three-step algorithm; unlabeled data; Chemical technology; Chemistry; Data engineering; Educational institutions; Employment; Filtering algorithms; Laboratories; Supervised learning; Text categorization; PU Classification
Conference_Title :
Computer Engineering and Technology, 2009. ICCET '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-3334-6
DOI :
10.1109/ICCET.2009.131