DocumentCode
183364
Title
Segmentation of Historical Handwritten Documents into Text Zones and Text Lines
Author
Gatos, Basilis ; Louloudis, Georgios ; Stamatopoulos, Nikolaos
Author_Institution
Comput. Intell. Lab., Nat. Center for Sci. Res. “Demokritos”, Athens, Greece
fYear
2014
fDate
1-4 Sept. 2014
Firstpage
464
Lastpage
469
Abstract
In order to achieve accurate text recognition performance for historical handwritten document images, robust and efficient page segmentation is necessary. In this paper, we propose a text zone detection method followed by a text line segmentation method, both suitable for historical handwritten documents. Our aim is to handle several challenging cases, such as horizontal and vertical rule lines overlapping with the text, two-column documents, and characters of different text lines touching vertically. For text zone detection, we analyze vertical rule lines, connected components, and vertical white runs; for text line segmentation, we enhance an existing approach based on the Hough transform in order to better treat cases of vertically connected characters. Both methods proved very promising in an evaluation on a set of historical handwritten documents.
Keywords
Hough transforms; document image processing; handwritten character recognition; history; image segmentation; text analysis; text detection; Hough transform; historical handwritten document images; horizontal rule lines; page segmentation; text line segmentation method; text recognition performance; text zone detection; vertical rule lines; Computational intelligence; Frequency modulation; Handwriting recognition; Image segmentation; Informatics; Laboratories; Telecommunications; historical document image processing; page segmentation; text line segmentation
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location
Heraklion
ISSN
2167-6445
Print_ISBN
978-1-4799-4335-7
Type
conf
DOI
10.1109/ICFHR.2014.84
Filename
6981063