DocumentCode
700062
Title
Modeling image degradations for improving OCR
Author
Barney Smith, Elisa H.
Author_Institution
Electr. & Comput. Eng., Boise State Univ., Boise, ID, USA
fYear
2008
fDate
25-29 Aug. 2008
Firstpage
1
Lastpage
5
Abstract
Clean documents are relatively easy to recognize. However, when digitizing collections of documents, the clean ones are rarely the documents that are encountered. The processes of printing and scanning documents introduce image degradations that interfere with the segmentation and recognition processes. Mathematical models of the degradation processes are presented. From these models, the observed types of degradation can be described both quantitatively and qualitatively. Included in the discussion are sampling, edge spread, corner erosion, and edge noise. The relationship between these degradations and common OCR errors is described. By considering the degradation model, a theoretical foundation is available to improve the document recognition process.
Keywords
edge detection; image denoising; image sampling; optical character recognition; OCR; corner erosion; document recognition; edge noise; edge spread; image degradations; image recognition; image segmentation; Additive noise; Degradation; Europe; Image edge detection; Optical character recognition software
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference, 2008 16th European
Conference_Location
Lausanne
ISSN
2219-5491
Type
conf
Filename
7080594