DocumentCode :
1514484
Title :
Manifold Adaptive Experimental Design for Text Categorization
Author :
Cai, Deng ; He, Xiaofei
Author_Institution :
State Key Lab. of CAD&CG, Zhejiang Univ., Hangzhou, China
Volume :
24
Issue :
4
fYear :
2012
fDate :
4/1/2012 12:00:00 AM
Firstpage :
707
Lastpage :
719
Abstract :
In many information processing tasks, labels are expensive while unlabeled data points are abundant. To reduce the cost of collecting labels, it is crucial to predict which unlabeled examples are the most informative, i.e., which would improve the classifier the most if they were labeled. Many active learning techniques have been proposed for text categorization, such as SVMActive and Transductive Experimental Design. However, most previous approaches try to discover the discriminant structure of the data space, whereas the geometrical structure is not well respected. In this paper, we propose a novel active learning algorithm which is performed in the data manifold adaptive kernel space. The manifold structure is incorporated into the kernel space by using the graph Laplacian. This way, the manifold adaptive kernel space reflects the underlying geometry of the data. By minimizing the expected error with respect to the optimal classifier, we can select the most representative and discriminative data points for labeling. Experimental results on text categorization demonstrate the effectiveness of the proposed approach.
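The abstract gives no formulas, so the following is only a minimal sketch of the idea it describes, not the authors' implementation. It assumes the commonly used graph-Laplacian kernel deformation K - K (I + lam L K)^{-1} lam L K and a simple greedy design criterion; the function names, the RBF base kernel, the k-nearest-neighbor graph, and the parameters `sigma`, `k`, `lam`, and `mu` are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Base kernel (an assumption; any positive semidefinite kernel could be used)."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma**2))

def knn_graph_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)              # exclude self-neighbors
    nearest = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)                    # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def manifold_adaptive_kernel(K, L, lam=0.1):
    """Deform K with the graph Laplacian: K - K (I + lam L K)^{-1} lam L K."""
    n = K.shape[0]
    M = lam * L
    return K - K @ np.linalg.solve(np.eye(n) + M @ K, M @ K)

def greedy_design(K, m, mu=0.1):
    """Greedily pick m points; each pick deflates K by the chosen column."""
    K = K.copy()
    selected = []
    for _ in range(m):
        scores = np.sum(K**2, axis=0) / (np.diag(K) + mu)
        if selected:
            scores[selected] = -np.inf        # never pick a point twice
        i = int(np.argmax(scores))
        selected.append(i)
        col = K[:, i].copy()
        K -= np.outer(col, col) / (K[i, i] + mu)
    return selected
```

Under these assumptions, a call such as `greedy_design(manifold_adaptive_kernel(rbf_kernel(X), knn_graph_laplacian(X)), m)` returns the indices of `m` candidate documents to label, where `X` is a hypothetical document-feature matrix.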
Keywords :
classification; data handling; graph theory; learning (artificial intelligence); support vector machines; text analysis; SVMActive; active learning algorithm; active learning techniques; cost reduction; data manifold adaptive kernel space; data space; discriminant structure; discriminative data points; geometrical structure; graph Laplacian; information processing tasks; manifold adaptive experimental design; manifold structure; optimal classifier; text categorization; transductive experimental design; unlabeled data points; Algorithm design and analysis; Kernel; Laplace equations; Manifolds; Nearest neighbor searches; Optimization; Text categorization; active learning; experimental design; kernel method; manifold learning
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2011.104
Filename :
5765958