• DocumentCode
    3488213
  • Title

    Text Line Detection in Corrupted and Damaged Historical Manuscripts

  • Author

    Rabaev, Irina ; Biller, Ofer ; El-Sana, Jihad ; Kedem, Klara ; Dinstein, Itshak

  • Author_Institution
    Dept. of Comput. Sci., Ben-Gurion Univ., Beer-Sheva, Israel
  • fYear
    2013
  • fDate
    25-28 Aug. 2013
  • Firstpage
    812
  • Lastpage
    816
  • Abstract
    Most of the algorithms proposed for text line detection are designed to process binary images as input. For severely degraded documents, binarization often introduces significant noise and other artifacts. In this work we present a novel method designed to detect text lines directly in gray scale images. The method consists of two stages. Potential characters are detected in the first stage. This is done by analyzing the evolution maps of connected components obtained by a sliding threshold. The detected potential characters are grouped into text lines in the second stage using a sweep-line approach. The suggested method is especially powerful when applied to torn and damaged documents that other algorithms are not able to deal with.
  • Keywords
    document image processing; history; object detection; text analysis; binary image processing; evolution maps; gray scale images; historical manuscripts; potential characters detection; sliding threshold; sweep-line approach; text line detection; Accuracy; Databases; Frequency modulation; Handwriting recognition; Image segmentation; Noise; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    1520-5363
  • Type

    conf

  • DOI
    10.1109/ICDAR.2013.166
  • Filename
    6628731
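
The two stages named in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the record does not specify how evolution maps are built or how the sweep line groups characters, so the code below assumes dark ink on a light page, uses a plain per-threshold component count as a stand-in for an evolution map, and groups components by vertical overlap while scanning left to right. All function names (`components_at_threshold`, `component_counts`, `group_into_lines`) are hypothetical.

```python
from collections import deque


def components_at_threshold(img, t):
    """Bounding boxes (top, left, bottom, right) of 4-connected regions
    whose gray value is below t. Assumes dark ink on a light page."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or img[r][c] >= t:
                continue
            # Breadth-first flood fill from an unvisited dark pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and img[ny][nx] < t):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def component_counts(img, thresholds):
    """Simplified stand-in for an evolution map: the number of
    components found at each value of the sliding threshold."""
    return [(t, len(components_at_threshold(img, t))) for t in thresholds]


def group_into_lines(boxes):
    """Simplified sweep: visit boxes left to right and attach each to the
    first existing line whose vertical extent it overlaps."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):
        top, _, bottom, _ = box
        for line in lines:
            if top <= line["bottom"] and bottom >= line["top"]:
                line["boxes"].append(box)
                line["top"] = min(line["top"], top)
                line["bottom"] = max(line["bottom"], bottom)
                break
        else:
            lines.append({"top": top, "bottom": bottom, "boxes": [box]})
    return sorted(lines, key=lambda ln: ln["top"])


# Tiny synthetic "page": two dark marks on row 1, a faint mark and a
# dark mark on row 3.
img = [
    [255, 255, 255, 255, 255, 255, 255],
    [255,  10, 255,  20, 255, 255, 255],
    [255, 255, 255, 255, 255, 255, 255],
    [255, 120, 120, 255,  30, 255, 255],
    [255, 255, 255, 255, 255, 255, 255],
]
counts = component_counts(img, [50, 150])
lines = group_into_lines(components_at_threshold(img, 150))
```

In this toy image the faint mark on row 3 only appears at the higher threshold, which is the kind of change across thresholds that a sliding-threshold analysis can track without committing to a single binarization.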