Title :
A System for Handwritten and Machine-Printed Text Separation in Bangla Document Images
Author :
Banerjee, Prithu ; Chaudhuri, Bidyut B.
Author_Institution :
Comput. Vision & Pattern Recognition Unit, Indian Stat. Inst., Kolkata, India
Abstract :
In this paper, we describe an approach to distinguish hand-written text from machine-printed text in annotated machine-printed Bangla document images. In OCR applications, distinguishing machine-printed from hand-written characters is important so that each can be sent to a separate recognition engine. Identifying hand-written parts is also useful for deleting them and cleaning the document image. We present a classification system that takes each connected component in the document image and assigns it to one of two classes, "machine-printed" or "hand-written". The proposed system contains a preprocessing step, which smooths the object border and finds the connected components. Bangla script-specific features are extracted from each connected component image, and a standard SVM-based classifier generates the final response. Experimental results on a data set show that the proposed approach achieves an overall accuracy of 96.49%.
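The component-extraction stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `connected_components` and `simple_features`, the choice of 8-connectivity, and the three generic shape features are all assumptions made for illustration. The abstract does not list the Bangla script-specific features, so none are reproduced here, and border smoothing is omitted. In a full pipeline, one feature vector per component would be passed to an SVM classifier.

```python
from collections import deque


def connected_components(image):
    """Group 8-connected foreground pixels (value 1) of a binary image.

    `image` is a list of equal-length rows of 0/1 values.
    Returns a list of components, each a list of (row, col) tuples.
    8-connectivity is an assumption; the abstract does not specify it.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def simple_features(pixels):
    """Hypothetical placeholder features: bounding-box size and ink density.

    These stand in for the paper's Bangla script-specific features,
    which the abstract does not describe.
    """
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {"width": width, "height": height,
            "density": len(pixels) / (width * height)}


if __name__ == "__main__":
    # Toy binary image with two separate blobs.
    image = [
        [1, 1, 0, 0, 0],
        [1, 0, 0, 0, 1],
        [0, 0, 0, 1, 1],
    ]
    for component in connected_components(image):
        print(simple_features(component))
```

Each dictionary returned by `simple_features` would become one row of the training matrix for the classifier, with a "machine-printed" or "hand-written" label per component.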
Keywords :
document image processing; feature extraction; handwritten character recognition; image classification; natural language processing; optical character recognition; support vector machines; text detection; Bangla script specific feature extraction; OCR; SVM; annotated machine-printed Bangla document images; classification system; connected component image; hand-written characters; hand-written class; hand-written parts; handwritten text separation; machine-printed characters; machine-printed class; machine-printed text separation; object border; recognition engines; standard classifier; Accuracy; Feature extraction; Handwriting recognition; Support vector machines; Text recognition; Training; Bangla Script Recognition; Printed and Handwritten Text Separation; SVM Classifier;
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location :
Bari
Print_ISBN :
978-1-4673-2262-1
DOI :
10.1109/ICFHR.2012.171