  • DocumentCode
    2143402
  • Title
    A Semi-supervised Ensemble Learning Approach for Character Labeling with Minimal Human Effort
  • Author
    Vajda, Szilárd ; Junaidi, Akmal ; Fink, Gernot A.
  • Author_Institution
    Dept. of Comput. Sci., Tech. Univ. Dortmund, Dortmund, Germany
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    259
  • Lastpage
    263
  • Abstract
    One of the major issues in handwritten character recognition is the efficient creation of ground truth to train and test the different recognizers. Manual labeling of the data by a human expert is a tedious and costly procedure. In this paper we propose an efficient and low-cost semi-automatic labeling system for character datasets. First, the data is represented at different abstraction levels and then clustered in an unsupervised manner. The resulting clusters are labeled by human experts, and finally unanimity voting decides whether a label is accepted or not. The experimental results show that labeling less than 0.5% of the training data is sufficient to achieve an 86.21% recognition rate for a brand new script (Lampung) and 94.81% for the MNIST benchmark dataset, using only a K-nearest neighbor classifier for recognition.
  • Keywords
    handwritten character recognition; image recognition; learning (artificial intelligence); K-nearest neighbor classifier; character labeling; low-cost semi-automatic labeling system; minimal human effort; semi-supervised ensemble learning; Character recognition; Clustering algorithms; Handwriting recognition; Humans; Labeling; Principal component analysis; Training; Lampung characters; ensemble learning; semi-supervised character labeling; unsupervised clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2011 International Conference on
  • Conference_Location
    Beijing
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4577-1350-7
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2011.60
  • Filename
    6065315
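
The unanimity voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `unanimity_vote`, the data layout, and the toy values are assumptions made for illustration. Each "view" stands for one representation of the data that has already been clustered, and each cluster is assumed to carry one label assigned by a human expert. A sample keeps a label only if every view assigns it the same one.

```python
from typing import Dict, List, Optional, Sequence


def unanimity_vote(
    cluster_ids: Sequence[Sequence[int]],
    cluster_labels: Sequence[Dict[int, str]],
) -> List[Optional[str]]:
    """Return one label per sample, or None when the views disagree.

    cluster_ids[v][i]    : cluster index of sample i in view v (hypothetical layout)
    cluster_labels[v][c] : label a human expert gave to cluster c in view v
    """
    n_views = len(cluster_ids)
    n_samples = len(cluster_ids[0])
    accepted: List[Optional[str]] = []
    for i in range(n_samples):
        # Collect the label each view proposes for sample i.
        votes = {cluster_labels[v][cluster_ids[v][i]] for v in range(n_views)}
        # Accept the label only if all views agree; otherwise reject it.
        accepted.append(votes.pop() if len(votes) == 1 else None)
    return accepted


# Toy data (made up): 4 samples, 3 views, 2 clusters per view.
cluster_ids = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
]
cluster_labels = [
    {0: "a", 1: "b"},
    {0: "a", 1: "b"},
    {0: "b", 1: "a"},
]
labels = unanimity_vote(cluster_ids, cluster_labels)
```

In this toy case the second sample is rejected because one view places it in a cluster labeled differently from the other two. In a pipeline like the one the abstract outlines, only the accepted samples would presumably be passed on as training data for the classifier.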