Title :
Restoration of Arbitrarily Warped Historical Document Images Using Flow Lines
Author :
Rahnemoonfar, Maryam ; Antonacopoulos, Apostolos
Author_Institution :
Pattern Recognition & Image Anal. (PRImA) Res. Lab., Univ. of Salford, Salford, UK
Abstract :
Historical documents frequently suffer from arbitrary geometric distortions (warping and folds) due to storage conditions, use and, to some extent, the printing process of the time. In addition, page curl can be prominent due to the scanning technique used. Such distortions adversely affect OCR and print-on-demand quality. Previous approaches to geometric restoration either focus only on the correction of page curl or require supplementary information obtained by additional scanning hardware, which is not practical for existing scans. This paper presents a new approach to detect and restore arbitrary warping and folds, in addition to page curl. Warped text lines and the smooth deformation between them are precisely modelled as primary and secondary flow lines that are then restored to their original linear shape. Preliminary, but representative, experimental results, in comparison to a leading page curl removal method and an industry-standard commercial system, demonstrate the effectiveness of the proposed method.
Keywords :
document image processing; optical character recognition; printing; text analysis; OCR; arbitrarily warped historical document image; flow line; geometric distortion; geometric restoration; industry-standard commercial system; page curl; primary flow line; print-on-demand quality; printing process; scanning hardware; scanning technique; secondary flow line; warped text line; Text analysis; arbitrary warping; dewarping; flow lines; geometric correction; historical documents; page curl removal; text line modelling;
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-1350-7
ISSN :
1520-5363
DOI :
10.1109/ICDAR.2011.184