DocumentCode
1634731
Title
Text Line Segmentation Based on Morphology and Histogram Projection
Author
dos Santos, Rodrigo P. ; Clemente, Gabriela S. ; Ren, Tsang Ing ; Cavalcanti, G.D.C.
Author_Institution
Center of Inf., Fed. Univ. of Pernambuco, Recife, Brazil
fYear
2009
Firstpage
651
Lastpage
655
Abstract
Text extraction is an important phase in document recognition systems. To segment text from a document page, it is necessary to detect all possible manuscript text regions. In this article we propose an efficient algorithm to segment handwritten text lines. The algorithm uses a morphological operator to obtain the features of the images. Next, a sequence of histogram projection and recovery steps is applied to obtain the segmented line regions of the text. First, a Y histogram projection is performed, which yields the positions of the text lines. A threshold is applied to divide the lines into different regions. After that, another threshold is used to eliminate false lines. These procedures, however, cause some loss of text line area, so a recovery method is proposed to minimize this effect. To detect the extreme positions of the text in the horizontal direction, an X histogram projection is applied. Then, as in the Y direction, another threshold is used to eliminate false words. Finally, to optimize the area of the manuscript text line, a text selection is carried out. Experimental results on the IAM database showed that this new approach is robust and fast and produces very good score rates.
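The projection and threshold steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the morphological preprocessing, the recovery method, and the final text selection are omitted, and the function names (`find_runs`, `segment_lines`) and all threshold parameters (`row_thresh`, `min_height`, `col_thresh`) are hypothetical choices made for illustration. It assumes a binarized page held in a NumPy array where nonzero pixels are ink.

```python
import numpy as np


def find_runs(mask):
    """Return (start, end) pairs, end exclusive, for each run of True in a 1-D mask."""
    padded = np.concatenate(([False], mask, [False]))
    diff = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(diff == 1)
    ends = np.flatnonzero(diff == -1)
    return list(zip(starts.tolist(), ends.tolist()))


def segment_lines(binary, row_thresh=1, min_height=2, col_thresh=1):
    """Return one (top, bottom, left, right) box per detected line, ends exclusive.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    ink = np.asarray(binary) > 0
    y_hist = ink.sum(axis=1)  # Y projection: ink pixels per row
    boxes = []
    # First threshold: rows with enough ink are grouped into candidate lines.
    for top, bottom in find_runs(y_hist >= row_thresh):
        # Second threshold: discard candidates too thin to be a text line.
        if bottom - top < min_height:
            continue
        # X projection inside the candidate gives its horizontal extremes.
        x_hist = ink[top:bottom].sum(axis=0)
        cols = np.flatnonzero(x_hist >= col_thresh)
        if cols.size == 0:
            continue
        boxes.append((top, bottom, int(cols[0]), int(cols[-1]) + 1))
    return boxes
```

On real handwriting, where ascenders and descenders of neighbouring lines overlap, fixed thresholds like these would likely merge or clip lines; the abstract's recovery step exists to compensate for that kind of loss.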
Keywords
document image processing; feature extraction; handwritten character recognition; image segmentation; text analysis; document recognition system; handwritten text line segmentation; histogram projection; manuscript text; morphological operator; Cultural differences; Data mining; Histograms; Image segmentation; Informatics; Morphological operations; Morphology; Robustness; Text analysis; Text recognition; Histogram Projection; Mathematical Morphology; Text Line Segmentation
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location
Barcelona
ISSN
1520-5363
Print_ISBN
978-1-4244-4500-4
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2009.183
Filename
5277563