DocumentCode :
2149266
Title :
Script-Free Text Line Segmentation Using Interline Space Model for Printed Document Images
Author :
Kim, Minwoo ; Oh, Il-Seok
Author_Institution :
Div. of Comput. Sci. & Eng., Chonbuk Nat. Univ., Jeonju, South Korea
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
1354
Lastpage :
1358
Abstract :
This paper proposes a model-based text line segmentation algorithm for machine-printed document images. The model is a geometric configuration built on the interline spaces rather than on the text lines themselves. The paper defines an objective function whose maximization yields the optimal solution. The primary advantage of the proposed interline space model is that it is script-free. The model is also versatile: it processes both horizontally and vertically written documents and infers the reading order. Experiments on document images in Latin, Korean, Chinese, and Japanese scripts support these advantages and show that the method tolerates noise.
Keywords :
document image processing; image segmentation; optimisation; text analysis; Chinese scripts; Japanese scripts; Korean scripts; Latin scripts; geometric configuration; interline space model; machine printed document image processing; maximization; model based text line segmentation algorithm; noise tolerance; objective function; optimal solution; script free text line segmentation; written document processing; Algorithm design and analysis; Analytical models; Floors; Image segmentation; Noise; Pattern analysis; Text analysis; geometric matching; interline space; model-based approach; reading order; text line segmentation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
ISSN :
1520-5363
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2011.272
Filename :
6065531