• DocumentCode
    178436
  • Title
    Word Spotting in Bangla and English Graphical Documents
  • Author
    Tarafdar, A. ; Pal, U. ; Ramel, J.-Y. ; Ragot, N. ; Chaudhuri, B.B.
  • Author_Institution
    CVPR Unit, Indian Stat. Inst., Kolkata, India
  • fYear
    2014
  • fDate
    24-28 Aug. 2014
  • Firstpage
    3044
  • Lastpage
    3049
  • Abstract
    Word spotting in graphical documents is a very challenging task. With the increasing use of electronic media, there is a need to search for objects in graphical documents by their text labels. To address such scenarios, we propose a word spotting system dedicated to graphical documents with Bangla and English scripts. In the proposed system, the text and graphics layers are first separated using a Gabor filter. In the text layer, a character segmentation approach based on the water reservoir method is applied to extract each character from the document. These isolated characters are then recognized using a rotation-invariant feature coupled with an SVM classifier. Well-recognized characters are then grouped based on their sizes. Initial spotting then searches for the query word among those groups of characters. If the system spots a word only partially because of noise, SIFT is applied to identify the missing portion of that partial match. Experimental results on English and Bangla script document images show that the method is feasible for spotting a location in text-labeled graphical documents.
  • Keywords
    Gabor filters; document image processing; image classification; image retrieval; image segmentation; natural language processing; optical character recognition; support vector machines; text analysis; transforms; Bangla graphical documents; Bangla scripts; English graphical documents; English scripts; Gabor filter; SIFT; SVM classifier; character segmentation approach; electronic media usage; isolated character recognition; object searching; query word; rotation invariant feature; scale invariant feature transform; support vector machine; text labeled graphical documents; text-graphics layers; word spotting system; Character recognition; Feature extraction; Gabor filters; Graphics; Reservoirs; Support vector machines; Clustering; Document Image Analysis; Gabor Filter; Graphical documents; Information Retrieval; SIFT feature; Water Reservoir Principle; Word Spotting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition (ICPR), 2014 22nd International Conference on
  • Conference_Location
    Stockholm
  • ISSN
    1051-4651
  • Type
    conf
  • DOI
    10.1109/ICPR.2014.525
  • Filename
    6977237