Title of article :
A Conclusive Methodology for Rating OCR Performance
Author/Authors :
Nathan E. Brener, S.S. Iyengar, and O.S. Pianykh
Issue Information :
Monthly, with serial issue number, year 2005
Abstract :
One of the most challenging topics in the automatic document
rating process is the development of a rating
scheme for the image quality of documents. As part of
the Department of Energy (DOE) document declassification
program, we have developed a generalized rating
system to predict the optical character recognition
(OCR) accuracy level that is achieved when processing a
document. The need for such a system emerged from
the declassification of degraded, typewriter-era documents,
which is currently a time-consuming manual
process. This article presents the statistical analysis of
the most influential document quality features affecting
OCR accuracy, develops consistent predictive models
for four currently used OCR engines, and studies the
applicability of different OCR products to the DOE document
declassification process. This study is expected to
lead to an efficient and completely automated document
declassification system.
Journal title :
Journal of the American Society for Information Science and Technology