DocumentCode :
1946455
Title :
Combining clustering coefficient-based active learning and semi-supervised learning on networked data
Author :
He, Xiaoqi ; Liu, Yangguang ; Xu, Bin ; Jin, Xiaogang
Author_Institution :
Ningbo Inst. of Technol., Zhejiang Univ., Ningbo, China
fYear :
2010
fDate :
15-16 Nov. 2010
Firstpage :
305
Lastpage :
309
Abstract :
Active learning and semi-supervised learning are both important techniques for improving a learned model with unlabeled data when labeled data is difficult to obtain but unlabeled data is plentiful and easy to collect. Combining active learning with a semi-supervised learning algorithm based on Gaussian fields and harmonic functions was suggested recently. That work showed that empirical risk minimization (ERM) could effectively find the next instance to label, but at a high computational cost. When the data is graph-structured, graph topological analysis can be used to rapidly select instances that are likely to be good candidates for labeling. This paper describes a novel approach that uses the clustering coefficient metric to identify the best instance to label next. In experiments on three binary classification tasks from the 20 Newsgroups dataset, the clustering coefficient strategy performs similarly to ERM while taking less time.
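As a rough illustration of the idea in the abstract, the sketch below computes the standard local clustering coefficient of each node in an undirected graph and ranks the unlabeled nodes by it. The abstract does not state how the coefficient is turned into a selection rule, so ranking by highest coefficient first is purely an assumption made here. The names `local_clustering`, `rank_candidates`, and the toy graph `adj` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Local clustering coefficient of node v in an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        # Fewer than two neighbours: no neighbour pairs exist, use 0 by convention.
        return 0.0
    # Count edges that connect two neighbours of v.
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def rank_candidates(adj, labeled):
    """Rank unlabeled nodes, highest coefficient first.

    ASSUMPTION: "highest first" is an illustrative choice, not the paper's rule.
    Ties are broken by node id so the ordering is deterministic.
    """
    scores = {v: local_clustering(adj, v) for v in adj if v not in labeled}
    return sorted(scores, key=lambda v: (-scores[v], v))


# Toy graph: a triangle 0-1-2 with a tail 2-3-4.
adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4},
    4: {3},
}

ranking = rank_candidates(adj, labeled={0})
```

Each coefficient needs only the neighbourhood of one node, which is consistent with the abstract's claim that a topology-based criterion is cheaper than ERM; the actual speed-up reported in the paper is not reproduced here.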
Keywords :
Gaussian processes; graph theory; harmonic analysis; learning (artificial intelligence); pattern classification; pattern clustering; risk analysis; Gaussian field; active learning; binary classification; clustering coefficient; empirical risk minimization; graph topological analysis; harmonic function; semisupervised learning; Accuracy; Learning; Machine learning; Measurement; Risk management; Uncertainty; semi-supervised learning
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligent Systems and Knowledge Engineering (ISKE), 2010 International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4244-6791-4
Type :
conf
DOI :
10.1109/ISKE.2010.5680858
Filename :
5680858