Title :
Shirorekha extraction in Character Segmentation for printed devanagri text in Document Image Processing
Author :
Shinde, A.B. ; Dandawate, Y.H.
Author_Institution :
Dept. of Electron. Eng., Padmabhooshan Vasantraodada Patil Inst. of Technol., Budhgaon, India
Abstract :
Finding the structural layout, text line segmentation, word-level segmentation and character-level segmentation are major steps in offline OCR systems for Devanagari script in document image processing. This paper presents a complete word and character segmentation method for machine-printed Devanagari text. Interline spacing and fused characters can sometimes make line segmentation and character segmentation, respectively, difficult. The method was tested on documents in Marathi script. After the Shirorekha (header line) of the Devanagari text is removed, bounding boxes are placed around the segmented characters. The method applies basic morphological operations to the scanned document images, and the authors attribute the encouraging results to these operations.
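The two steps named in the abstract (removing the Shirorekha, then boxing the separated characters) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract mentions morphological operations, while the sketch below substitutes simple projection profiles to keep the example dependency-free. The function names, the toy `WORD` image and the 0.9 threshold are all assumptions made for illustration.

```python
def remove_shirorekha(word, ratio=0.9):
    """Blank the header line of a binary word image (1 = ink, 0 = background).

    Assumption: the Shirorekha is every row whose ink count is at least
    `ratio` times the densest row (horizontal projection profile).
    Returns the cleaned image and the indices of the removed rows.
    """
    row_ink = [sum(row) for row in word]
    peak = max(row_ink)
    if peak == 0:  # blank image, nothing to remove
        return [list(row) for row in word], []
    header = [i for i, n in enumerate(row_ink) if n >= ratio * peak]
    cleaned = [[0] * len(row) if i in header else list(row)
               for i, row in enumerate(word)]
    return cleaned, header


def character_boxes(image):
    """Return inclusive (left, top, right, bottom) boxes, one per run of
    adjacent inked columns (vertical projection profile)."""
    height, width = len(image), len(image[0])
    col_ink = [sum(image[r][c] for r in range(height)) for c in range(width)]
    boxes, start = [], None
    for c in range(width + 1):
        inked = c < width and col_ink[c] > 0
        if inked and start is None:
            start = c  # a new character begins
        elif not inked and start is not None:
            rows = [r for r in range(height) if any(image[r][start:c])]
            boxes.append((start, rows[0], c - 1, rows[-1]))
            start = None
    return boxes


# Hypothetical toy word: two characters joined by a header line in row 1.
WORD = [[int(ch) for ch in line] for line in (
    "000000000",
    "111111111",
    "010000100",
    "011000110",
    "010000100",
)]

cleaned, header = remove_shirorekha(WORD)
print(header)                    # rows treated as the Shirorekha
print(character_boxes(WORD))     # one box: the header joins the characters
print(character_boxes(cleaned))  # two boxes once the header is removed
```

On the toy image the header line makes the whole word one connected run of columns, so only a single box is found. Once that row is blanked, the two characters separate. Fused characters, which the abstract names as a difficulty, would still share inked columns and are not handled by this sketch.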
Keywords :
document image processing; feature extraction; image segmentation; text detection; Marathi scripts; Shirorekha extraction; bounding boxes; character segmentation system; document image processing; header line; machine printed Devanagari text; morphological operations; word segmentation system; Image resolution; Image segmentation; Optical imaging; Radio frequency; Character Segmentation; Devanagari Script; Line Segmentation; Structural Layout; Word Segmentation;
Conference_Title :
India Conference (INDICON), 2014 Annual IEEE
Conference_Location :
Pune
Print_ISBN :
978-1-4799-5362-2
DOI :
10.1109/INDICON.2014.7030535