DocumentCode
2131401
Title
Semantic Features for Multi-view Semi-supervised and Active Learning of Text Classification
Author
Sun, Shiliang
Author_Institution
Dept. of Comput. Sci. & Technol., East China Normal Univ., Shanghai
fYear
2008
fDate
15-19 Dec. 2008
Firstpage
731
Lastpage
735
Abstract
For multi-view learning, existing methods usually train classifiers on the originally provided features, which ignores the latent correlation between different views. In this paper, semantic features that integrate information from multiple views are extracted for pattern representation. Canonical correlation analysis is used to learn the semantic spaces, and the semantic features are the projections of the original features onto the basis vectors of these spaces. We investigate the feasibility of semantic features in two learning paradigms: semi-supervised learning and active learning. Experiments on text classification with two state-of-the-art multi-view learning algorithms, co-training and co-testing, indicate that using semantic features can lead to a significant improvement in performance.
Keywords
feature extraction; learning (artificial intelligence); text analysis; active learning; classifier training; multiview semi-supervised; pattern representation; semantic features; text classification; Computer science; Conferences; Data mining; Functional analysis; Hydrogen; Labeling; Semisupervised learning; Sun; Text categorization; Web pages; Multi-view learning; co-testing; co-training; data mining; text classification
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
Conference_Location
Pisa
Print_ISBN
978-0-7695-3503-6
Electronic_ISBN
978-0-7695-3503-6
Type
conf
DOI
10.1109/ICDMW.2008.13
Filename
4734000