DocumentCode
2834273
Title
Prediction of OCR accuracy using simple image features
Author
Blando, Luis R. ; Kanai, Junichi ; Nartker, Thomas A.
Author_Institution
Inf. Sci. Res. Inst., Nevada Univ., Las Vegas, NV, USA
Volume
1
fYear
1995
fDate
14-16 Aug 1995
Firstpage
319
Abstract
A classifier for predicting the character accuracy achieved by any Optical Character Recognition (OCR) system on a given page is presented. This classifier is based on measuring the amount of white speckle, the amount of character fragments, and overall size information in the page. No output from the OCR system is used. The given page is classified as either “good” quality (i.e., high OCR accuracy expected) or “poor” quality (i.e., low OCR accuracy expected). Results of processing 639 pages show a recognition rate of approximately 85%. This performance compares favorably with the ideal-case performance of a prediction method based upon the number of reject-markers in OCR-generated text.
Keywords
document image processing; image classification; optical character recognition; prediction theory; OCR; OCR generated text; character accuracy; classifier; recognition rate; white speckle; Accuracy; Adaptive optics; Character recognition; Costs; Degradation; Information science; Optical character recognition software; Optical filters; Prediction methods; Speckle
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location
Montreal, Que.
Print_ISBN
0-8186-7128-9
Type
conf
DOI
10.1109/ICDAR.1995.599003
Filename
599003