Title :
Prediction of OCR accuracy using simple image features
Author :
Blando, Luis R. ; Kanai, Junichi ; Nartker, Thomas A.
Author_Institution :
Inf. Sci. Res. Inst., Nevada Univ., Las Vegas, NV, USA
Abstract :
A classifier for predicting the character accuracy achieved by any Optical Character Recognition (OCR) system on a given page is presented. This classifier is based on measuring the amount of white speckle, the amount of character fragments, and overall size information in the page. No output from the OCR system is used. The given page is classified as either “good” quality (i.e., high OCR accuracy expected) or “poor” quality (i.e., low OCR accuracy expected). Results of processing 639 pages show a recognition rate of approximately 85%. This performance compares favorably with the ideal-case performance of a prediction method based upon the number of reject-markers in OCR-generated text.
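The abstract names three kinds of image measurement but gives neither their definitions nor any thresholds. The Python sketch below is one possible reading, not the authors' method: it assumes a binarized page stored as a rectangular, non-empty list of rows with 1 for black and 0 for white, treats connected regions no larger than a hypothetical `small` x `small` bounding box as speckle (white) or fragments (black), and uses made-up cutoffs `max_speckle` and `max_fragments`. All function and parameter names are illustrative.

```python
from collections import deque


def components(grid, value):
    """Return (height, width, area) for each 4-connected region of `value`."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top = bottom = r
            left = right = c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((bottom - top + 1, right - left + 1, area))
    return regions


def page_features(grid, small=3):
    """Compute illustrative features; `small` is an assumed size limit in pixels."""
    black = components(grid, 1)
    white = components(grid, 0)

    def is_small(region):
        height, width, _ = region
        return height <= small and width <= small

    return {
        # Share of white regions that are tiny (e.g. holes inside strokes).
        "white_speckle": sum(is_small(w) for w in white) / max(len(white), 1),
        # Share of black regions that are tiny (possible broken characters).
        "fragments": sum(is_small(b) for b in black) / max(len(black), 1),
        # Crude size information: mean height of black regions.
        "mean_height": sum(b[0] for b in black) / max(len(black), 1),
    }


def classify(features, max_speckle=0.1, max_fragments=0.3):
    """Label a page 'good' or 'poor' using assumed, untuned thresholds."""
    if (features["white_speckle"] > max_speckle
            or features["fragments"] > max_fragments):
        return "poor"
    return "good"
```

Calling `classify(page_features(page))` on a binarized page returns "good" or "poor". In this sketch the size feature is only reported, not used in the decision, and the thresholds would need to be tuned against pages with known OCR accuracy before the labels mean anything.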
Keywords :
document image processing; image classification; optical character recognition; prediction theory; OCR; OCR generated text; character accuracy; classifier; recognition rate; white speckle; Accuracy; Adaptive optics; Character recognition; Costs; Degradation; Information science; Optical character recognition software; Optical filters; Prediction methods; Speckle
Conference_Title :
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-8186-7128-9
DOI :
10.1109/ICDAR.1995.599003