DocumentCode
2143402
Title
A Semi-supervised Ensemble Learning Approach for Character Labeling with Minimal Human Effort
Author
Vajda, Szilárd ; Junaidi, Akmal ; Fink, Gernot A.
Author_Institution
Dept. of Comput. Sci., Tech. Univ. Dortmund, Dortmund, Germany
fYear
2011
fDate
18-21 Sept. 2011
Firstpage
259
Lastpage
263
Abstract
One of the major issues in handwritten character recognition is the efficient creation of ground truth to train and test the different recognizers. Manual labeling of the data by a human expert is a tedious and costly procedure. In this paper we propose an efficient and low-cost semi-automatic labeling system for character datasets. First, the data is represented at different abstraction levels, which are then clustered in an unsupervised manner. The resulting clusters are labeled by human experts, and finally a unanimity vote decides whether a label is accepted or not. The experimental results show that labeling less than 0.5% of the training data is sufficient to achieve an 86.21% recognition rate for a brand new script (Lampung) and 94.81% for the MNIST benchmark dataset, using only a K-nearest neighbor classifier for recognition.
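The unanimity voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each representation ("view") has already been clustered and that an expert has assigned one label to each cluster. The abstract does not specify the features, the clustering algorithm, or the data layout, so the function name `unanimity_vote`, the list and dictionary layout, and the toy labels "ka" and "ga" are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def unanimity_vote(
    cluster_ids_per_view: Sequence[Sequence[int]],
    cluster_labels_per_view: Sequence[Dict[int, str]],
) -> List[Optional[str]]:
    """Return one label per sample, or None when the views disagree.

    cluster_ids_per_view[v][i]  : cluster index of sample i in view v
    cluster_labels_per_view[v]  : expert label assigned to each cluster of view v
    (Both structures are assumptions made for this illustration.)
    """
    n_samples = len(cluster_ids_per_view[0])
    accepted: List[Optional[str]] = []
    for i in range(n_samples):
        # Each view proposes the expert label of the cluster holding sample i.
        proposals = {
            labels[ids[i]]
            for ids, labels in zip(cluster_ids_per_view, cluster_labels_per_view)
        }
        # Accept the label only if every view proposes the same one.
        accepted.append(proposals.pop() if len(proposals) == 1 else None)
    return accepted


# Toy example: four samples, two views, two clusters per view.
view_a = [0, 0, 1, 1]
view_b = [1, 1, 0, 1]
labels_a = {0: "ka", 1: "ga"}
labels_b = {0: "ga", 1: "ka"}

result = unanimity_vote([view_a, view_b], [labels_a, labels_b])
print(result)
```

In this toy run the first three samples receive the same label from both views and are accepted, while the fourth is rejected because the views disagree. Only accepted samples would be passed on as training data for a downstream classifier such as K-nearest neighbor.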
Keywords
handwritten character recognition; image recognition; learning (artificial intelligence); K-nearest neighbor classifier; character labeling; low-cost semi-automatic labeling system; minimal human effort; semi-supervised ensemble learning; Character recognition; Clustering algorithms; Handwriting recognition; Humans; Labeling; Principal component analysis; Training; Lampung characters; ensemble learning; semi-supervised character labeling; unsupervised clustering
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location
Beijing
ISSN
1520-5363
Print_ISBN
978-1-4577-1350-7
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2011.60
Filename
6065315