• DocumentCode
    2834273
  • Title

    Prediction of OCR accuracy using simple image features

  • Author

    Blando, Luis R. ; Kanai, Junichi ; Nartker, Thomas A.

  • Author_Institution
    Inf. Sci. Res. Inst., Nevada Univ., Las Vegas, NV, USA
  • Volume
    1
  • fYear
    1995
  • fDate
    14-16 Aug 1995
  • Firstpage
    319
  • Abstract
    A classifier for predicting the character accuracy achieved by any Optical Character Recognition (OCR) system on a given page is presented. This classifier is based on measuring the amount of white speckle, the amount of character fragments, and overall size information in the page. No output from the OCR system is used. The given page is classified as either “good” quality (i.e., high OCR accuracy expected) or “poor” quality (i.e., low OCR accuracy expected). Results of processing 639 pages show a recognition rate of approximately 85%. This performance compares favorably with the ideal-case performance of a prediction method based upon the number of reject markers in OCR-generated text.
  • Keywords
    document image processing; image classification; optical character recognition; prediction theory; OCR; OCR generated text; character accuracy; classifier; recognition rate; white speckle; Accuracy; Adaptive optics; Character recognition; Costs; Degradation; Information science; Optical character recognition software; Optical filters; Prediction methods; Speckle
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    0-8186-7128-9
  • Type

    conf

  • DOI
    10.1109/ICDAR.1995.599003
  • Filename
    599003
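
The abstract describes classifying a page as "good" or "poor" from image features alone (white speckle, character fragments, size information). The sketch below illustrates that general idea and is not the authors' method: it assumes a binarized page given as a list of rows (1 = black, 0 = white), treats "fragments" as very small black connected components and "white speckle" as very small white connected components, and applies made-up thresholds. The names `component_sizes`, `small_fraction`, `classify_page` and the parameters `max_small`, `speckle_limit`, `fragment_limit` are all hypothetical, and the size feature mentioned in the abstract is omitted.

```python
from collections import deque


def component_sizes(grid, value):
    """Return pixel counts of 4-connected components whose pixels equal `value`."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def small_fraction(sizes, max_size):
    """Fraction of components with at most `max_size` pixels (0.0 if none exist)."""
    if not sizes:
        return 0.0
    return sum(1 for s in sizes if s <= max_size) / len(sizes)


def classify_page(grid, max_small=2, speckle_limit=0.5, fragment_limit=0.5):
    """Label a binary page 'good' or 'poor'. All thresholds are illustrative assumptions."""
    fragment_ratio = small_fraction(component_sizes(grid, 1), max_small)
    speckle_ratio = small_fraction(component_sizes(grid, 0), max_small)
    if fragment_ratio > fragment_limit or speckle_ratio > speckle_limit:
        return "poor"
    return "good"


if __name__ == "__main__":
    clean = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    noisy = [
        [1, 0, 1, 0, 1],
        [0, 0, 0, 0, 0],
        [1, 0, 1, 0, 1],
    ]
    print(classify_page(clean), classify_page(noisy))
```

In practice the thresholds would have to be fitted on pages with known OCR accuracy; the values here exist only to make the toy example run.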