DocumentCode :
2198339
Title :
User-Defined Expected Error Rate in OCR Postprocessing by Means of Automatic Threshold Estimation
Author :
Navarro-Cerdan, J. Ramon ; Arlandis, Joaquim ; Perez-Cortes, Juan-Carlos ; Llobet, Rafael
Author_Institution :
Inst. Tecnol. de Inf., Univ. Politec. de Valencia, Valencia, Spain
fYear :
2010
fDate :
16-18 Nov. 2010
Firstpage :
405
Lastpage :
409
Abstract :
This work presents a method for automatically estimating a threshold that lets the user of an OCR system define an expected error rate. When the OCR output is post-processed using a language model, a probability, a reliability index, or a “transformation cost” is usually obtained, reflecting the likelihood (or, for a cost, its inverse) that the string of OCR hypotheses belongs to the model. By applying a threshold to this index (or cost) to reject the less reliable hypotheses, a variable level of expected accuracy can be imposed on the output. For the user, being able to “fix” the expected error rate at an acceptable level is much more convenient than having to deal with an arbitrary threshold. Of course, the result will always be higher reject rates for difficult tasks and lower reject rates for easier tasks.
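The idea of turning a target error rate into a rejection threshold can be illustrated with a minimal sketch. This is not the authors' estimation procedure, which the abstract does not detail. It assumes a labelled validation set in which each OCR hypothesis has a transformation cost (lower means more reliable) and a flag saying whether it was correct. The function name `estimate_threshold` and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def estimate_threshold(
    costs: List[float],
    correct: List[bool],
    target_error_rate: float,
) -> Optional[Tuple[float, float]]:
    """Pick the loosest cost threshold whose accepted set meets the target.

    A hypothesis is accepted when its cost is <= the threshold. Returns
    (threshold, reject_rate), or None if no threshold meets the target.
    Assumption: the validation data are representative of the real task.
    """
    samples = sorted(zip(costs, correct))  # most reliable hypotheses first
    n = len(samples)
    best = None
    errors = 0
    i = 0
    while i < n:
        # Consume every sample tied at this cost, since a threshold
        # cannot separate hypotheses that share the same cost.
        j = i
        while j < n and samples[j][0] == samples[i][0]:
            if not samples[j][1]:
                errors += 1
            j += 1
        accepted = j
        if errors / accepted <= target_error_rate:
            best = (samples[i][0], 1 - accepted / n)
        i = j
    return best


# Toy validation data (invented for illustration only).
costs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
correct = [True, True, True, False, True, True, False, False]
print(estimate_threshold(costs, correct, 0.2))  # (0.6, 0.25)
```

On this toy data, tightening the target to zero errors moves the threshold down to 0.3 and raises the reject rate to 0.625, which mirrors the trade-off the abstract describes: a stricter error rate or a harder task costs more rejections.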
Keywords :
image segmentation; optical character recognition; probability; OCR postprocessing; automatic threshold estimation; language model; reliability index; user defined expected error rate
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on
Conference_Location :
Kolkata
Print_ISBN :
978-1-4244-8353-2
Type :
conf
DOI :
10.1109/ICFHR.2010.126
Filename :
5693597