DocumentCode
2146801
Title
Restoration of Arbitrarily Warped Historical Document Images Using Flow Lines
Author
Rahnemoonfar, Maryam ; Antonacopoulos, Apostolos
Author_Institution
Pattern Recognition & Image Anal. (PRImA) Res. Lab., Univ. of Salford, Salford, UK
fYear
2011
fDate
18-21 Sept. 2011
Firstpage
905
Lastpage
909
Abstract
Historical documents frequently suffer from arbitrary geometric distortions (warping and folds) due to storage conditions, use and, to some extent, the printing process of the time. In addition, page curl can be prominent due to the scanning technique used. Such distortions adversely affect OCR and print-on-demand quality. Previous approaches to geometric restoration either focus only on the correction of page curl or require supplementary information obtained by additional scanning hardware, which is not practical for existing scans. This paper presents a new approach to detect and restore arbitrary warping and folds, in addition to page curl. Warped text lines and the smooth deformation between them are precisely modelled as primary and secondary flow lines that are then restored to their original linear shape. Preliminary, but representative, experimental results, in comparison to a leading page curl removal method and an industry-standard commercial system, demonstrate the effectiveness of the proposed method.
Keywords
document image processing; optical character recognition; printing; text analysis; OCR; arbitrarily warped historical document image; flow line; geometric distortion; geometric restoration; industry-standard commercial system; page curl; primary flow line; print-on-demand quality; printing process; scanning hardware; scanning technique; secondary flow line; warped text line; arbitrary warping; dewarping; flow lines; geometric correction; historical documents; page curl removal; text line modelling
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location
Beijing
ISSN
1520-5363
Print_ISBN
978-1-4577-1350-7
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2011.184
Filename
6065442