DocumentCode
2577641
Title
Restoring Chinese documents images based on text boundary lines
Author
Liu, Hong ; Ding, Runwei
Author_Institution
Key Lab. of Machine Perception & Intell., Peking Univ., Beijing, China
fYear
2009
fDate
11-14 Oct. 2009
Firstpage
571
Lastpage
576
Abstract
Distortion commonly appears in document images when thick bound volumes are scanned. Scanned grayscale images suffer from two kinds of distortion: a shadow appears at the volumes' spine area, and the words within the shadow are warped. In this paper, a novel method based on text boundary lines is presented for efficient restoration of warped scanned Chinese document images. We first detect on which side of an image the shadow lies using a row grayscale analysis method. The shadow is then removed by a modified Niblack's algorithm. To detect the warping, a text boundary line detection method is proposed. Finally, an adjustment method based on the text boundary lines is carried out to restore the warped words. Experiments are conducted on 400 scanned Chinese document images of various kinds. The improvement in average character recall is 11.92% to 14.89%. Experiments show that the proposed restoration method is efficient for Chinese documents with both text and non-text regions.
Keywords
distortion; document image processing; image denoising; image restoration; object detection; text analysis; Chinese documents image restoration; adjustment method; document image distortion; image detection; modified Niblack algorithm; row grayscale analysis method; scanned grayscale images; text boundary lines based method; warped scanning Chinese document images; Books; Computer vision; Gray-scale; Image processing; Image reconstruction; Image restoration; Laboratories; Machine intelligence; Shape; Surface reconstruction; distortion; restoration; text boundary lines; warped document images;
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on
Conference_Location
San Antonio, TX
ISSN
1062-922X
Print_ISBN
978-1-4244-2793-2
Electronic_ISBN
1062-922X
Type
conf
DOI
10.1109/ICSMC.2009.5346660
Filename
5346660