DocumentCode :
949990
Title :
Script-Independent Text Line Segmentation in Freestyle Handwritten Documents
Author :
Yi Li ; Yefeng Zheng ; Doermann, David ; Jaeger, S.
Volume :
30
Issue :
8
fYear :
2008
Firstpage :
1313
Lastpage :
1329
Abstract :
Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine-printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map, where each element represents the probability that the underlying pixel belongs to a text line. The level set method is then used to determine the boundaries of neighboring text lines by evolving an initial estimate. Unlike connected-component-based methods (for example, [1], [2]), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents in diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods. Further experiments show that the proposed algorithm is robust to scale change, rotation, and noise.
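The first stage described in the abstract, estimating a per-pixel text-line probability map from the document image, can be illustrated with a minimal sketch. The abstract does not state which density estimator is used, so everything below is an assumption made for illustration: a separable Gaussian kernel that is wider horizontally than vertically, zero padding at the borders, and normalization by the peak value. The function names (`gaussian_kernel`, `smooth_rows`, `probability_map`) and the parameters `sigma_x` and `sigma_y` are hypothetical. The level set evolution stage is not shown.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve every row with the kernel; pixels outside the image count as 0."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = x + k - radius
                if 0 <= xx < width:
                    acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]


def probability_map(binary, sigma_x=4.0, sigma_y=1.0):
    """Estimate a text-line density from a binary image (1 = ink, 0 = background).

    Assumption: sigma_x > sigma_y, so ink is spread mostly along the writing
    direction. This bridges gaps between words while keeping lines separate.
    """
    blurred = smooth_rows(binary, gaussian_kernel(sigma_x))
    blurred = transpose(smooth_rows(transpose(blurred), gaussian_kernel(sigma_y)))
    peak = max(max(row) for row in blurred)
    if peak == 0:
        return blurred  # blank page: no ink, density is zero everywhere
    return [[v / peak for v in row] for row in blurred]


if __name__ == "__main__":
    # Toy page: two broken horizontal strokes on rows 2 and 6 of a 9 x 20 image.
    page = [[0] * 20 for _ in range(9)]
    for x in range(20):
        if x % 5 != 4:  # leave one-pixel gaps to imitate spaces between words
            page[2][x] = 1
            page[6][x] = 1
    pm = probability_map(page)
    for row in pm:
        print(" ".join(f"{v:.2f}" for v in row))
```

On the toy page, the rows carrying ink receive the highest values, the one-pixel word gaps are filled in by the horizontal smoothing, and the row midway between the two strokes stays comparatively low. A boundary-finding stage such as the level set method named in the abstract would operate on a map of this kind.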
Keywords :
document image processing; estimation theory; handwritten character recognition; image segmentation; probability; set theory; text analysis; connected component based method; document analysis problem; freestyle handwritten document image segmentation; hand-printed document; level set method; machine printed document; probability map estimation; script-independent curvilinear text line segmentation; Document analysis; Document and Text Processing; Handwriting analysis; Algorithms; Artificial Intelligence; Automatic Data Processing; Documentation; Handwriting; Image Enhancement; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity;
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/TPAMI.2007.70792
Filename :
4359385