DocumentCode
178436
Title
Word Spotting in Bangla and English Graphical Documents
Author
Tarafdar, A. ; Pal, U. ; Ramel, J.-Y. ; Ragot, N. ; Chaudhuri, B.B.
Author_Institution
CVPR Unit, Indian Stat. Inst., Kolkata, India
fYear
2014
fDate
24-28 Aug. 2014
Firstpage
3044
Lastpage
3049
Abstract
Word spotting in graphical documents is a very challenging task. With the increasing use of electronic media, there is a need to search for objects in graphical documents by their text labels. To address such scenarios, we propose a word spotting system dedicated to graphical documents with Bangla and English scripts. In the proposed system, the text and graphics layers are first separated using a Gabor filter. In the text layer, a character segmentation approach based on the water reservoir method is applied to extract each character from the document. These isolated characters are then recognized using rotation-invariant features coupled with an SVM classifier. Well-recognized characters are then grouped based on their sizes. Initial spotting then searches for the query word among those groups of characters. If the system spots a word only partially because of noise, SIFT is applied to identify the missing portion of that partial spotting. Experimental results on English and Bangla script document images show that the method can feasibly spot a location in text-labeled graphical documents.
Keywords
Gabor filters; document image processing; image classification; image retrieval; image segmentation; natural language processing; optical character recognition; support vector machines; text analysis; transforms; Bangla graphical documents; Bangla scripts; English graphical documents; English scripts; Gabor filter; SIFT; SVM classifier; character segmentation approach; electronic media usage; isolated character recognition; object searching; query word; rotation invariant feature; scale invariant feature transform; support vector machine; text labeled graphical documents; text-graphics layers; word spotting system; Character recognition; Feature extraction; Gabor filters; Graphics; Reservoirs; Support vector machines; Clustering; Document Image Analysis; Gabor Filter; Graphical documents; Information Retrieval; SIFT feature; Water Reservoir Principle; Word Spotting;
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition (ICPR), 2014 22nd International Conference on
Conference_Location
Stockholm
ISSN
1051-4651
Type
conf
DOI
10.1109/ICPR.2014.525
Filename
6977237