• DocumentCode
    3489891
  • Title

    A Binarization-Free Clustering Approach to Segment Curved Text Lines in Historical Manuscripts

  • Author

    Garz, Angelika ; Fischer, Andreas ; Bunke, Horst ; Ingold, Rolf

  • Author_Institution
    DIVA, Univ. of Fribourg, Fribourg, Switzerland
  • fYear
    2013
  • fDate
    25-28 Aug. 2013
  • Firstpage
    1290
  • Lastpage
    1294
  • Abstract
    Text line segmentation is one of the main parts of document image analysis; it provides crucial information for automated reading, word spotting, alignment between image and transcription, and indexing of documents. Yet it remains an open problem for handwritten historical documents because of complex layouts on the one hand, such as curved and touching text lines, and binarization problems on the other hand, caused by ornaments, wrinkles, stains, holes, etc. In this paper, we propose a binarization-free clustering method for text line segmentation that is able to cope not only with touching text lines but also with complex baseline curvature. Avoiding the assumption of straight baselines, the method groups small interest point clusters into text lines based on their local orientation. Experiments conducted on artificially distorted images of the Saint Gall database show promising results.
  • Keywords
    document image processing; image segmentation; pattern clustering; automated reading; binarization-free clustering approach; complex baseline curvature; curved text line segmentation; document image analysis; handwritten historical documents; historical manuscripts; indexing; local orientation; saint gall database; small interest point clusters; straight baselines; word spotting; Accuracy; Context; Databases; Image segmentation; Layout; Noise; Text analysis; curved lines; historical documents; local features; text line segmentation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    1520-5363
  • Type

    conf

  • DOI
    10.1109/ICDAR.2013.261
  • Filename
    6628822