DocumentCode
3394637
Title
Trimming approach for word segmentation with focus on overlapping characters
Author
Gomathi, S. ; Devi, R. S. Uma ; Mohanavel, S.
Author_Institution
Dept. of Comput. Sci., Sri Ramakrishna Eng. Coll., Coimbatore, India
fYear
2013
fDate
4-6 Jan. 2013
Firstpage
1
Lastpage
4
Abstract
Document image analysis methods often fail on freestyle handwritten documents, in which text lines are curvilinear and gaps between words are nonuniform. This paper introduces a relatively simple method that is more tolerant of such cases. The proposed word segmentation method requires the document to be already segmented into text lines. The system begins by pre-processing the scanned image of the handwritten text, enhancing some features and eliminating some inconsistencies to increase recognition accuracy. It addresses the problem that spatial measures and thresholds are sensitive to the shape of the connected components (CCs) by reducing the region of interest to the core region. It also rectifies a problem of segmenting words from lines with the bounding box (BB) method, which suppresses the structure of the characters. The trimmed mean (TM) is used both to detect the core region and as the threshold for gap discrimination. The system was developed in Java, and its performance was evaluated on word images selected from the IAM database. Applied to 1100 text lines, the segmentation scheme achieved 96.7% accuracy, whereas the BB method achieved only 90.1%.
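The gap-discrimination idea in the abstract can be illustrated with a minimal Java sketch. It is not the authors' implementation: the abstract does not state the trimming fraction, how gaps between connected components are measured, or how the core region is computed. The class name `TrimmedMeanGaps`, the methods `trimmedMean` and `wordBreaks`, the `trimFraction` parameter, and the sample gap widths are all hypothetical. The sketch assumes horizontal gap widths (in pixels) between adjacent components are already available and treats any gap wider than the trimmed mean as a word boundary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TrimmedMeanGaps {

    // Mean of the values after discarding the lowest and highest
    // trimFraction of the sorted samples (symmetric trimming).
    static double trimmedMean(double[] values, double trimFraction) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        if (trimFraction < 0.0 || trimFraction >= 0.5) {
            throw new IllegalArgumentException("trimFraction must be in [0, 0.5)");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int cut = (int) Math.floor(sorted.length * trimFraction);
        double sum = 0.0;
        for (int i = cut; i < sorted.length - cut; i++) {
            sum += sorted[i];
        }
        return sum / (sorted.length - 2 * cut);
    }

    // Indices of gaps wider than the trimmed-mean threshold.
    // Assumption: such gaps separate words; the rest separate characters.
    static List<Integer> wordBreaks(double[] gaps, double trimFraction) {
        double threshold = trimmedMean(gaps, trimFraction);
        List<Integer> breaks = new ArrayList<>();
        for (int i = 0; i < gaps.length; i++) {
            if (gaps[i] > threshold) {
                breaks.add(i);
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        // Hypothetical gap widths for one text line, in pixels.
        double[] gaps = {2, 3, 14, 2, 4, 15, 3, 2};
        System.out.println("threshold = " + trimmedMean(gaps, 0.125));
        System.out.println("word breaks at gap indices " + wordBreaks(gaps, 0.125));
    }
}
```

Trimming discards the extreme gap widths before averaging, so a single unusually wide or narrow gap shifts the threshold less than it would shift a plain mean. In the sample line, the two wide gaps (indices 2 and 5) are reported as word breaks.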
Keywords
Java; document image processing; handwriting recognition; image segmentation; text detection; BB method; CC; IAM database; TM; bounding box method; connected component; core region detection; document image analysis methods; freestyle handwritten documents; gap discrimination; overlapping characters; region of interest; text lines; trimmed mean; trimming approach; word segmentation; Databases; Measurement; Text analysis; Text recognition; Distance Computation
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Communication and Informatics (ICCCI), 2013 International Conference on
Conference_Location
Coimbatore
Print_ISBN
978-1-4673-2906-4
Type
conf
DOI
10.1109/ICCCI.2013.6466272
Filename
6466272