DocumentCode
135089
Title
Text line identification in Tagore's manuscript
Author
Adak, Chandranath ; Chaudhuri, Bidyut B.
Author_Institution
Dept. of Comput. Sci. & Eng., Univ. of Kalyani, Kalyani, India
fYear
2014
fDate
Feb. 28 2014-March 2 2014
Firstpage
210
Lastpage
213
Abstract
In this paper, a text line identification method is proposed. The text lines of a printed document are easy to segment because the lines are uniformly straight and the gaps between them are sufficient. In handwritten documents, however, the lines are nonuniform and the interline gaps vary. We work with Rabindranath Tagore's manuscript because it is one of the most difficult manuscripts that contain doodles. Our method begins with a preprocessing stage that cleans the document image. We then separate the doodles from the manuscript to obtain the textual region. After that, we identify the text lines in the manuscript. For text line identification, we use window examination, black run-length smearing, a horizontal histogram, and connected component analysis.
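As a rough illustration of two of the techniques named in the abstract, the sketch below combines black run-length smearing with a horizontal histogram to locate text line bands in a binarized image. It is not the authors' implementation: the abstract gives no algorithmic detail, so the function names, the `max_gap` and `min_black` parameters, and the list-of-lists image representation (1 = black, 0 = white) are all assumptions made for this example. Window examination, connected component analysis, and doodle separation are not shown.

```python
def smear_rows(image, max_gap):
    """Horizontal run-length smearing (assumed variant): fill each white run
    of length <= max_gap that lies between two black pixels on the same row."""
    smeared = [row[:] for row in image]  # copy so the input is left untouched
    for row in smeared:
        last_black = None
        for x, pixel in enumerate(row):
            if pixel == 1:
                if last_black is not None and 0 < x - last_black - 1 <= max_gap:
                    for k in range(last_black + 1, x):
                        row[k] = 1
                last_black = x
    return smeared


def horizontal_histogram(image):
    """Count the black pixels in every row (the horizontal projection profile)."""
    return [sum(row) for row in image]


def find_text_lines(image, max_gap=2, min_black=1):
    """Return (top_row, bottom_row) pairs for bands of consecutive rows whose
    smeared black-pixel count is at least min_black (hypothetical threshold)."""
    profile = horizontal_histogram(smear_rows(image, max_gap))
    lines, start = [], None
    for y, count in enumerate(profile):
        if count >= min_black and start is None:
            start = y                      # a text band begins
        elif count < min_black and start is not None:
            lines.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom of the image
        lines.append((start, len(profile) - 1))
    return lines
```

A plain projection profile like this works only when lines are roughly horizontal and do not overlap vertically, which is exactly the assumption the abstract says fails for handwriting. That is presumably why the paper pairs it with the other listed techniques.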
Keywords
document image processing; handwritten character recognition; image segmentation; optical character recognition; text analysis; Rabindranath Tagore manuscript; black run-length smearing; connected component analysis; document image cleaning; doodle separation; handwritten documents; horizontal histogram; nonuniform line; preprocessing stage; printed document; text line identification method; text line segmentation; textual region; uniform line straightness; variable interline gaps; window examination; Character recognition; Frequency modulation; Handwriting recognition; Histograms; Image analysis; Optical filters; Text analysis; document image analysis; doodle; handwritten document; manuscript processing; text line identification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Students' Technology Symposium (TechSym), 2014 IEEE
Conference_Location
Kharagpur
Print_ISBN
978-1-4799-2607-7
Type
conf
DOI
10.1109/TechSym.2014.6808048
Filename
6808048