• DocumentCode
    1634731
  • Title

    Text Line Segmentation Based on Morphology and Histogram Projection

  • Author

    dos Santos, Rodrigo P. ; Clemente, Gabriela S. ; Ren, Tsang Ing ; Cavalcanti, G.D.C.

  • Author_Institution
    Center of Inf., Fed. Univ. of Pernambuco, Recife, Brazil
  • fYear
    2009
  • Firstpage
    651
  • Lastpage
    655
  • Abstract
    Text extraction is an important phase in document recognition systems. To segment text from a document page, it is necessary to detect all possible manuscript text regions. In this article we propose an efficient algorithm to segment handwritten text lines. The algorithm uses a morphological operator to obtain the features of the image. Next, a sequence of histogram projection and recovery steps is applied to obtain the segmented line regions of the text. First, a Y histogram projection is performed, which yields the positions of the text lines. A threshold is applied to divide the lines into different regions. After that, another threshold is used to eliminate false lines. These procedures, however, cause some loss of text line area, so a recovery method is proposed to minimize this effect. To detect the extreme positions of the text in the horizontal direction, an X histogram projection is applied. Then, as in the Y direction, another threshold is used to eliminate false words. Finally, to optimize the area of the manuscript text line, a text selection step is carried out. Experimental results on the IAM-database show that this new approach is robust and fast and produces very good score rates.
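
The projection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an already binarized image given as a list of rows of 0/1 values (1 = ink), omits the morphological step, and approximates the recovery step by padding each band with a fixed margin. The function names, the parameters `row_thresh`, `min_height`, `margin` and `col_thresh`, and their default values are all hypothetical.

```python
def find_runs(flags):
    """Return (start, end) pairs, end exclusive, for each run of True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def segment_lines(image, row_thresh=1, min_height=2, margin=1, col_thresh=1):
    """Return (top, bottom, left, right) boxes, bottom and right exclusive.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    height = len(image)

    # Y histogram projection: amount of ink in each row.
    y_hist = [sum(row) for row in image]

    # First threshold: rows with enough ink form candidate line bands.
    bands = find_runs([value >= row_thresh for value in y_hist])

    # Second threshold: bands that are too thin are treated as false lines.
    bands = [(top, bottom) for top, bottom in bands if bottom - top >= min_height]

    boxes = []
    for top, bottom in bands:
        # Recovery (assumed form): pad the band to regain area lost to thresholding.
        top = max(0, top - margin)
        bottom = min(height, bottom + margin)

        # X histogram projection inside the band: amount of ink in each column.
        x_hist = [sum(column) for column in zip(*image[top:bottom])]
        columns = [x for x, value in enumerate(x_hist) if value >= col_thresh]
        if columns:
            boxes.append((top, bottom, columns[0], columns[-1] + 1))
    return boxes
```

In this sketch the horizontal extent is taken from the first and last columns that pass `col_thresh`; the elimination of false words and the final text selection described in the abstract are not modeled.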
  • Keywords
    document image processing; feature extraction; handwritten character recognition; image segmentation; text analysis; document recognition system; handwritten text line segmentation; histogram projection; manuscript text; morphological operator; Cultural differences; Data mining; Histograms; Image segmentation; Informatics; Morphological operations; Morphology; Robustness; Text analysis; Text recognition; Histogram Projection; Mathematical Morphology; Text Line Segmentation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4244-4500-4
  • Electronic_ISBN
    1520-5363
  • Type

    conf

  • DOI
    10.1109/ICDAR.2009.183
  • Filename
    5277563