• DocumentCode
    135089
  • Title

    Text line identification in Tagore's manuscript

  • Author

    Adak, Chandranath ; Chaudhuri, Bidyut B.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Kalyani, Kalyani, India
  • fYear
    2014
  • fDate
    Feb. 28 2014-March 2 2014
  • Firstpage
    210
  • Lastpage
    213
  • Abstract
    In this paper, a text line identification method is proposed. The text lines of a printed document are easy to segment because the lines are uniformly straight and the gaps between them are sufficient. In handwritten documents, however, the lines are nonuniform and the interline gaps are variable. We use Rabindranath Tagore's manuscript because it is among the most difficult manuscripts that contain doodles. Our method begins with a preprocessing stage that cleans the document image. We then separate the doodles from the manuscript to obtain the textual region, and finally identify the text lines in that region. For text line identification, we use window examination, black run-length smearing, a horizontal histogram, and connected component analysis.
  • Keywords
    document image processing; handwritten character recognition; image segmentation; optical character recognition; text analysis; Rabindranath Tagore manuscript; black run-length smearing; connected component analysis; document image cleaning; doodle separation; handwritten documents; horizontal histogram; nonuniform line; preprocessing stage; printed document; text line identification method; text line segmentation; textual region; uniform line straightness; variable interline gaps; window examination; Character recognition; Frequency modulation; Handwriting recognition; Histograms; Image analysis; Optical filters; Text analysis; document image analysis; doodle; handwritten document; manuscript processing; text line identification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Students' Technology Symposium (TechSym), 2014 IEEE
  • Conference_Location
    Kharagpur
  • Print_ISBN
    978-1-4799-2607-7
  • Type

    conf

  • DOI
    10.1109/TechSym.2014.6808048
  • Filename
    6808048
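
Two of the techniques named in the abstract, black run-length smearing and the horizontal histogram, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smear_rows` and `line_bands`, the parameters `max_gap` and `min_ink`, and the boolean ink-mask input are all assumptions made for illustration, and the record does not state how the paper combines these steps with window examination or connected component analysis.

```python
import numpy as np


def smear_rows(binary, max_gap):
    """Horizontal run-length smearing (illustrative, hypothetical helper).

    `binary` is a 2-D boolean array where True marks ink. Background runs of
    at most `max_gap` pixels that lie between two ink pixels in the same row
    are filled, so nearby characters merge into one blob.
    """
    out = binary.copy()
    for r in range(binary.shape[0]):
        cols = np.flatnonzero(binary[r])
        for a, b in zip(cols[:-1], cols[1:]):
            # b - a - 1 is the number of background pixels between a and b.
            if 1 < b - a <= max_gap + 1:
                out[r, a:b] = True
    return out


def line_bands(binary, min_ink=1):
    """Find candidate text-line bands from the horizontal histogram.

    Returns (top, bottom) row ranges, bottom exclusive, for every maximal run
    of rows whose ink count is at least `min_ink` (an assumed threshold).
    """
    profile = binary.sum(axis=1)  # ink pixels per row
    active = profile >= min_ink
    bands = []
    start = None
    for r, on in enumerate(active):
        if on and start is None:
            start = r
        elif not on and start is not None:
            bands.append((start, r))
            start = None
    if start is not None:
        bands.append((start, len(active)))
    return bands
```

Smearing first and then taking the histogram is one common ordering. On real handwritten pages with skewed or touching lines, a plain row histogram is usually not enough on its own, which is presumably why the abstract lists further steps.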