Title :
Efficient Active Learning Based on Uncertain Clusters
Author :
Juihsi Fu ; Singling Lee ; Wangping Wu
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Chung Cheng Univ., Min-Hsiung, Taiwan
Abstract :
In active learning, as few raw samples as possible are queried to learn an accurate classifier. However, queried samples may suffer from low diversity if they are selected without considering sample content, and a classifier trained on similar queried samples is learned inefficiently. In this paper, an approach called ALUC is proposed to increase the diversity of queried uncertain samples. Before they are queried, raw samples are clustered based on the prior data distribution and sample uncertainty. First, cluster seeds are found according to the underlying data distribution, without defining the number of clusters in advance. The distance metric is designed to generate small clusters when they contain uncertain samples. Consequently, the representative samples of the clusters are both diverse in content and informative to query. Experimental results on a synthetic dataset and real-world datasets show that the proposed distance metric for clustering is effective at finding raw samples that are similar in content and uncertainty, and that ALUC is able to query informative and diverse samples to produce an accurate classifier.
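The abstract does not give the ALUC distance metric or its seed-finding procedure, so the following is only a minimal sketch of the general idea, under stated assumptions. It swaps in a simple leader-style clustering (a sample farther than `radius` from every existing seed becomes a new seed, so the number of clusters is not fixed in advance) and a hypothetical metric that stretches Euclidean distance by sample uncertainty, so that uncertain regions split into smaller clusters. The names `binary_entropy`, `inflated_distance`, `cluster_and_select`, `radius`, and `alpha` are illustrative and do not come from the paper.

```python
import numpy as np


def binary_entropy(p1):
    """Uncertainty in [0, 1] from predicted class-1 probabilities (1 = most uncertain)."""
    p = np.clip(np.asarray(p1, dtype=float), 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))


def inflated_distance(x, seed, u_x, u_seed, alpha=2.0):
    """Hypothetical metric (an assumption, not the paper's): Euclidean distance
    stretched by the larger of the two uncertainties."""
    return np.linalg.norm(x - seed) * (1.0 + alpha * max(u_x, u_seed))


def cluster_and_select(X, u, radius=0.5, alpha=2.0):
    """Leader-style clustering (a stand-in for the paper's seed search),
    followed by one query per cluster."""
    X = np.asarray(X, dtype=float)
    u = np.asarray(u, dtype=float)
    seeds = []  # indices of the samples that act as cluster seeds
    labels = np.empty(len(X), dtype=int)
    for i in range(len(X)):
        dists = [inflated_distance(X[i], X[s], u[i], u[s], alpha) for s in seeds]
        if dists and min(dists) <= radius:
            # Close enough to an existing seed: join the nearest cluster.
            labels[i] = int(np.argmin(dists))
        else:
            # Too far from every seed: this sample starts a new cluster.
            seeds.append(i)
            labels[i] = len(seeds) - 1
    # Query the most uncertain member of each cluster as its representative.
    queries = []
    for c in range(len(seeds)):
        members = np.flatnonzero(labels == c)
        queries.append(int(members[np.argmax(u[members])]))
    return labels, queries
```

In this sketch, setting `alpha=0` reduces the metric to plain Euclidean distance, which makes it easy to compare how many clusters (and therefore queries) fall in uncertain regions with and without the uncertainty term.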
Keywords :
learning (artificial intelligence); pattern classification; pattern clustering; query processing; uncertainty handling; ALUC; active learning; classifier; cluster seeds; data distribution; distance metric; sample querying; sample uncertainty; uncertain clusters; Accuracy; Clustering algorithms; Educational institutions; Learning systems; Measurement; Prediction algorithms; Uncertainty; clustering
Conference_Title :
Technologies and Applications of Artificial Intelligence (TAAI), 2012 Conference on
Conference_Location :
Tainan
Print_ISBN :
978-1-4673-4976-5
DOI :
10.1109/TAAI.2012.70