DocumentCode :
534911
Title :
Hierarchical exploration based active learning with support vector machine
Author :
Yang, Yanping ; Song, Enmin ; Ma, Guangzhi
Author_Institution :
Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan, China
Volume :
1
fYear :
2010
fDate :
13-14 Sept. 2010
Firstpage :
299
Lastpage :
302
Abstract :
The goal of active learning is to minimize the amount of labeled data required for machine learning. Some methods focus on exploiting samples with high uncertainty, but they fail to obtain a representative set of the data samples. Other methods explore representative samples by utilizing the prior distribution of the dataset. However, they are often computationally expensive and need a large amount of labeled data for initialization. In this paper we develop a hierarchical exploration based active learning algorithm that takes into account both the distribution of the dataset and the decision boundary of the current hypothesis. Our method uses the support vector machine (SVM) as the classifier. A hierarchical clustering algorithm is used to discover the dataset's structure step by step in a top-down manner. In each step of hierarchical structure discovery, representative samples are queried for labels to check the purity of the corresponding cluster. A cluster with low purity is divided further. After a draft SVM model is built with those representative samples, uncertain samples near the decision boundary are further labeled if they can help reduce the entropy of the classifier. To show its effectiveness, the proposed method is compared with five state-of-the-art algorithms on six datasets from UCI. Our method shows the best performance in this comparison.
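The top-down exploration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering method, how representatives are chosen, or the purity threshold, so the 2-means split, the "sample nearest each sub-cluster centroid" representatives, the "both labels agree" purity test, and all names below (`explore`, `two_means`, `oracle`, `min_size`) are assumptions made for illustration. The SVM training and the entropy-based querying near the decision boundary are omitted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(X, idx):
    """Mean of the samples X[i] for i in idx."""
    dim = len(X[idx[0]])
    return [sum(X[i][d] for i in idx) / len(idx) for d in range(dim)]

def nearest(X, idx, point):
    """Index in idx whose sample lies closest to point."""
    return min(idx, key=lambda i: sq_dist(X[i], point))

def two_means(X, idx, iters=20):
    """Split idx into two groups with a deterministic 2-means loop.

    Seeds: the sample farthest from the centroid, then the sample
    farthest from that one. Returns (idx, []) if no split is possible.
    """
    c = centroid(X, idx)
    a = max(idx, key=lambda i: sq_dist(X[i], c))
    b = max(idx, key=lambda i: sq_dist(X[i], X[a]))
    centers = [X[a], X[b]]
    groups = [idx, []]
    for _ in range(iters):
        new = [[], []]
        for i in idx:
            side = 1 if sq_dist(X[i], centers[1]) < sq_dist(X[i], centers[0]) else 0
            new[side].append(i)
        if not new[0] or not new[1] or new == groups:
            break  # degenerate split or converged
        groups = new
        centers = [centroid(X, g) for g in groups]
    return groups

def explore(X, oracle, min_size=4):
    """Top-down exploration: query representatives, split impure clusters.

    oracle(i) returns the true label of sample i (the human annotator).
    Returns the queried labels and a list of (cluster indices, label) leaves.
    """
    labeled = {}

    def query(i):
        if i not in labeled:  # never pay for the same label twice
            labeled[i] = oracle(i)
        return labeled[i]

    leaves = []
    stack = [list(range(len(X)))]
    while stack:
        idx = stack.pop()
        left, right = two_means(X, idx) if len(idx) >= min_size else (idx, [])
        if not right:
            # Too small or unsplittable: label it by its most central sample.
            leaves.append((idx, query(nearest(X, idx, centroid(X, idx)))))
            continue
        reps = [nearest(X, g, centroid(X, g)) for g in (left, right)]
        labels = {query(i) for i in reps}
        if len(labels) == 1:
            leaves.append((idx, labels.pop()))  # treated as pure: stop here
        else:
            stack.extend([left, right])  # impure: divide further
    return labeled, leaves
```

Calling `explore(X, oracle)` on a list of numeric tuples `X` returns the dictionary of queried labels, which could seed an initial classifier, together with the clusters that were judged pure. A two-representative agreement test is a coarse stand-in for a purity estimate; querying more representatives per cluster would trade extra labels for a more reliable check.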
Keywords :
data structures; learning (artificial intelligence); pattern classification; support vector machines; uncertainty handling; active learning; classifier; data representation; dataset's structure; hierarchical clustering algorithm; hierarchical exploration; labeled data; machine learning; support vector machine; uncertainty; Educational institutions; active learning; hierarchical clustering; support vector machine;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Natural Computing Proceedings (CINC), 2010 Second International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-7705-0
Type :
conf
DOI :
10.1109/CINC.2010.5643833
Filename :
5643833