DocumentCode :
2008865
Title :
Research of PU Text Semi-supervised Classification Based on Ontology Feature Extraction
Author :
Luo, Na ; Yuan, Fuyu ; Zuo, Wanli ; He, Fengling
Author_Institution :
Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun
fYear :
2008
fDate :
11-13 Dec. 2008
Firstpage :
835
Lastpage :
838
Abstract :
To address the shortcomings of traditional statistics-based feature extraction on PU (positive-unlabeled) problems, we propose ontology-based feature extraction to improve the performance of PU classification. We improve the PEBL algorithm: we obtain the document vectors of the positive set using ontology-based feature extraction, then find the strong positive features, which include the crossing semantics in the positive documents and have a higher frequency in the positive set. The improved algorithm scans the documents twice. First, we obtain the semantics of the documents from the ontology. Second, we filter out the terms that include none of these semantics, which reduces the dimension and yields the document vector. Experiments show that the improved PEBL classifier increases the F1 score by 0.7389%.
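The two-scan procedure described in the abstract might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the toy ontology (a plain term-to-concepts dictionary), the function names, and the document-frequency threshold used as a stand-in for "strong positive features" are all hypothetical, and the PEBL classification step itself is not shown.

```python
from collections import Counter

# Hypothetical toy ontology: term -> set of concept labels.
# This is an assumption for illustration, not the ontology used in the paper.
TOY_ONTOLOGY = {
    "neuron": {"biology"},
    "synapse": {"biology"},
    "cell": {"biology"},
    "protein": {"biology", "chemistry"},
}


def document_concepts(docs, ontology):
    """Scan 1: collect every concept the ontology assigns to a term in docs."""
    concepts = set()
    for doc in docs:
        for term in doc:
            concepts |= ontology.get(term, set())
    return concepts


def document_vectors(docs, ontology, concepts):
    """Scan 2: drop terms that share no concept, then count the remaining terms."""
    vectors = []
    for doc in docs:
        kept = [t for t in doc if ontology.get(t, set()) & concepts]
        vectors.append(Counter(kept))
    return vectors


def strong_positive_features(vectors, min_doc_freq):
    """Keep terms that occur in at least min_doc_freq positive documents.

    Document frequency is an assumed stand-in for the paper's criterion.
    """
    doc_freq = Counter()
    for vec in vectors:
        doc_freq.update(vec.keys())
    return {t for t, n in doc_freq.items() if n >= min_doc_freq}


# Each positive document is a list of already-tokenized terms.
positive_docs = [
    ["neuron", "synapse", "the", "protein"],
    ["neuron", "cell", "of", "protein"],
]
concepts = document_concepts(positive_docs, TOY_ONTOLOGY)
vectors = document_vectors(positive_docs, TOY_ONTOLOGY, concepts)
strong = strong_positive_features(vectors, min_doc_freq=2)
```

In this sketch, terms with no ontology entry (such as "the" and "of") are filtered out in the second scan, which is how the dimension of the document vector shrinks.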
Keywords :
classification; feature extraction; learning (artificial intelligence); ontologies (artificial intelligence); text analysis; PEBL algorithm; dimension reduction; document semantics; document vector; ontology feature extraction; positive-unlabeled semisupervised text classification; Application software; Chemical technology; Computer science; Educational institutions; Feature extraction; Frequency; Laboratories; Machine learning; Ontologies; Text categorization; F Score; Ontology; PU; Semi-supervised;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-0-7695-3495-4
Type :
conf
DOI :
10.1109/ICMLA.2008.19
Filename :
4725076