Title :
Flexible Text Recovery from Degraded Typewritten Historical Documents
Author :
Antonacopoulos, A. ; Castilla, C. Casado
Author_Institution :
Sch. of Comput., Sci. & Eng., Salford Univ.
Abstract :
The conversion of large collections of historical typewritten documents into digital libraries and archives is met with significant challenges that standard recognition techniques cannot address. The condition and individual nature of characters in these degraded documents necessitate a departure from existing thresholding approaches. This paper presents a flexible approach that overcomes the difficulties posed by such documents by analysing each character individually and cautiously repairing it. The main sources of OCR errors are successfully addressed and reliable corrective actions are taken.
Keywords :
document image processing; image segmentation; optical character recognition; OCR; degraded typewritten historical documents; digital libraries; flexible text recovery; image thresholding approaches; Degradation; Error correction; Focusing; Image analysis; Image converters; Optical character recognition software; Pattern recognition; Pressing; Software libraries; Text analysis
Conference_Title :
Pattern Recognition, 2006. ICPR 2006. 18th International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2521-0
DOI :
10.1109/ICPR.2006.581