Title :
A Semi-supervised Ensemble Learning Approach for Character Labeling with Minimal Human Effort
Author :
Vajda, Szilárd ; Junaidi, Akmal ; Fink, Gernot A.
Author_Institution :
Dept. of Comput. Sci., Tech. Univ. Dortmund, Dortmund, Germany
Abstract :
One of the major issues in handwritten character recognition is the efficient creation of ground truth to train and test the different recognizers. Manual labeling of the data by a human expert is a tedious and costly procedure. In this paper we propose an efficient and low-cost semi-automatic labeling system for character datasets. First, the data is represented at different abstraction levels, and each representation is then clustered in an unsupervised manner. The resulting clusters are labeled by human experts, and finally a unanimity vote decides whether a label is accepted or not. The experimental results show that labeling less than 0.5% of the training data is sufficient to achieve an 86.21% recognition rate for a brand new script (Lampung) and 94.81% for the MNIST benchmark dataset, using only a K-nearest neighbor classifier for recognition.
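The unanimity voting step described in the abstract can be illustrated with a minimal sketch. It assumes that each representation (called a "view" here) has already been clustered and that an expert has assigned one label per cluster. The function name `unanimity_vote`, the data layout, and the toy values are hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable, List, Optional, Sequence


def unanimity_vote(
    cluster_ids: Sequence[Sequence[int]],
    cluster_labels: Sequence[Dict[int, Hashable]],
) -> List[Optional[Hashable]]:
    """Return one label per sample, or None when the views disagree.

    cluster_ids[v][i]  -> cluster index of sample i in view v
    cluster_labels[v]  -> expert-assigned label of each cluster in view v
    (Both structures are assumptions made for this illustration.)
    """
    if not cluster_ids:
        return []
    n_views = len(cluster_ids)
    n_samples = len(cluster_ids[0])
    accepted: List[Optional[Hashable]] = []
    for i in range(n_samples):
        # Each view proposes the label of the cluster that sample i fell into.
        votes = {cluster_labels[v][cluster_ids[v][i]] for v in range(n_views)}
        # Accept the label only if every view proposes the same one.
        accepted.append(votes.pop() if len(votes) == 1 else None)
    return accepted


# Toy data: 5 samples seen through 3 hypothetical views.
cluster_ids = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 1, 2, 2, 2],
]
cluster_labels = [
    {0: "a", 1: "b", 2: "b"},
    {0: "b", 1: "a"},
    {0: "a", 1: "c", 2: "b"},
]
labels = unanimity_vote(cluster_ids, cluster_labels)
print(labels)  # ['a', None, 'b', 'b', 'b']
```

In this sketch, samples that receive `None` stay unlabeled. Only the unanimously labeled samples would be passed on as training data, for example to a K-nearest neighbor classifier.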
Keywords :
handwritten character recognition; image recognition; learning (artificial intelligence); K-nearest neighbor classifier; character labeling; low-cost semi-automatic labeling system; minimal human effort; semi-supervised ensemble learning; Character recognition; Clustering algorithms; Handwriting recognition; Humans; Labeling; Principal component analysis; Training; Lampung characters; ensemble learning; semi-supervised character labeling; unsupervised clustering
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-1350-7
ISSN :
1520-5363
DOI :
10.1109/ICDAR.2011.60