DocumentCode
1627090
Title
Detection and segmentation of lines and words in Gurmukhi handwritten text
Author
Kumar, Rajiv ; Singh, Amardeep
Author_Institution
SMCA, Thapar Univ., Patiala, India
fYear
2010
Firstpage
353
Lastpage
356
Abstract
A scanned text image is not editable: although it contains text, one cannot modify the scanned document. This limitation motivates optical character recognition (OCR), the process of recognizing a segmented part of a scanned image as a character. The overall OCR process consists of three major sub-processes: preprocessing, segmentation, and recognition. Of these, segmentation is the backbone, because incorrect segmentation cannot lead to correct recognition results (garbage in, garbage out). Segmentation is also one of the more complex steps, and it is harder still for handwritten documents, which offer only a few cues on which to base segmentation. In this paper, we formulate an approach to segmenting a scanned document image. The approach initially treats the whole image as one large window. This window is then broken into smaller windows, each containing a line; each line window is then used to find the words in that line, and finally the characters. For this purpose we use a variable-sized window, that is, a window whose size can be adjusted as needed. The concept was implemented and the results were analyzed. Following the analysis, the concept was modified and tested on different documents, with reasonably good results.
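The top-down idea in the abstract (whole image, then line windows, then word windows) can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in plain projection profiles as the splitting criterion, assumes the image is already binarized (1 = ink, 0 = background), and the names `find_runs`, `segment_lines`, `segment_words`, and `min_gap` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed binarized input: 1 = ink, 0 = background


def find_runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where profile > 0.

    Runs separated by fewer than `min_gap` empty positions are merged,
    so small gaps inside a word do not split it.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)
    return merged


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split the whole-image window into line windows (row ranges)."""
    row_profile = [sum(row) for row in image]
    return find_runs(row_profile)


def segment_words(image: Image, line: Tuple[int, int],
                  min_gap: int = 2) -> List[Tuple[int, int]]:
    """Split one line window into word windows (column ranges)."""
    top, bottom = line
    width = len(image[0]) if image else 0
    col_profile = [sum(image[r][c] for r in range(top, bottom))
                   for c in range(width)]
    return find_runs(col_profile, min_gap)


if __name__ == "__main__":
    demo = [
        [0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 1, 1, 0],
        [1, 1, 0, 0, 0, 1, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 1, 0, 0, 0, 1],
    ]
    for line in segment_lines(demo):
        print(line, segment_words(demo, line))
```

The window boundaries here adapt to the ink found in each row or column range, which is one simple reading of a "variable-sized window"; real handwritten Gurmukhi pages with skewed or touching lines would likely need a more tolerant criterion than an exactly empty row.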
Keywords
handwritten character recognition; image segmentation; optical character recognition; Gurmukhi handwritten text; OCR recognition; OCR theory; image preprocessing; lines segmentation; optical character recognition theory; scanned document image segmentation; scanned text image; words segmentation; Application software; Banking; Bones; Character recognition; Computer vision; Flowcharts; Image recognition; Image segmentation; Office automation; Optical character recognition software; Characteristics; Flexible; Gurmukhi; Handwritten; OCR; Segmentation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advance Computing Conference (IACC), 2010 IEEE 2nd International
Conference_Location
Patiala
Print_ISBN
978-1-4244-4790-9
Electronic_ISBN
978-1-4244-4791-6
Type
conf
DOI
10.1109/IADCC.2010.5422927
Filename
5422927