DocumentCode :
3488307
Title :
Greedy Search for Active Learning of OCR
Author :
Agarwal, Abhishek ; Garg, Radhika ; Chaudhury, Santanu
Author_Institution :
Dept. of Electr. Eng., Indian Inst. of Technol., Delhi, New Delhi, India
fYear :
2013
fDate :
25-28 Aug. 2013
Firstpage :
837
Lastpage :
841
Abstract :
Active learning and crowd sourcing are becoming increasingly popular in the machine learning community for fast, cost-effective generation of labels for large volumes of data. However, such labels may be noisy, so it becomes important to ignore the noisy labels when building a good classifier. We propose a framework for finding the best possible augmentation of a classifier for the character recognition problem using a minimum number of crowd-labeled samples. The approach inherently rejects the noisy data and tries to accept a subset of correctly labeled data to maximize classifier performance.
Keywords :
image classification; learning (artificial intelligence); optical character recognition; search problems; OCR; active learning; character recognition problem; classifier; crowd labeled samples; greedy search; noisy data rejection; optical character recognition; Accuracy; Character recognition; Noise; Noise measurement; Optical character recognition software; Support vector machines; Training; Character recognition; Indian scripts; active learning; crowd sourcing; greedy search; incremental SVM;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
ISSN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2013.171
Filename :
6628736