DocumentCode :
2600200
Title :
Flexible Text Recovery from Degraded Typewritten Historical Documents
Author :
Antonacopoulos, A. ; Castilla, C. Casado
Author_Institution :
Sch. of Comput., Sci. & Eng., Salford Univ.
Volume :
2
fYear :
2006
fDate :
0-0 0
Firstpage :
1062
Lastpage :
1065
Abstract :
The conversion of large collections of historical typewritten documents into digital libraries and archives is met with significant challenges that standard recognition techniques cannot address. The condition and individual nature of characters in these degraded documents necessitate a departure from existing thresholding approaches. This paper presents a flexible approach designed to overcome the difficulties presented by such documents by flexibly analysing each individual character and cautiously repairing it. The main sources of OCR errors are successfully addressed and reliable corrective actions are taken.
Keywords :
document image processing; image segmentation; optical character recognition; OCR; degraded typewritten historical documents; digital libraries; flexible text recovery; image thresholding approaches; Degradation; Error correction; Focusing; Image analysis; Image converters; Optical character recognition software; Pattern recognition; Pressing; Software libraries; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition, 2006. ICPR 2006. 18th International Conference on
Conference_Location :
Hong Kong
ISSN :
1051-4651
Print_ISBN :
0-7695-2521-0
Type :
conf
DOI :
10.1109/ICPR.2006.581
Filename :
1699391