• DocumentCode
    3073744
  • Title
    Retrieval Of Information In Document Image Databases Using Partial Word Image Matching Technique
  • Author
    Yadav, Seema ; Sawarkar, Sudhir
  • Author_Institution
    MGM Coll. of Eng. Kalamboli, Datta Meghe COE, Airoli
  • fYear
    2009
  • fDate
    6-7 March 2009
  • Firstpage
    552
  • Lastpage
    557
  • Abstract
    With the popularity and importance of document images as an information source, information retrieval in document image databases has become a challenge. This paper proposes an approach capable of matching partial word images to address two issues in document image retrieval: word spotting and similarity measurement between documents. Each word image is first represented by a primitive string. An inexact string matching technique is then used to measure the similarity between the primitive string generated from the query word and the word string generated from the document. Based on this similarity, we can estimate how relevant one word image is to another and decide whether one is a portion of the other. To deal with various character fonts, each word image is represented by a primitive string that is tolerant to serif and font differences. By using inexact string matching, the method is able to handle the problem of heavily touching characters. Experimental results on a variety of document image databases confirm that the proposed approach is feasible, valid, and efficient in document image retrieval.
  • Keywords
    document image processing; image matching; information retrieval; string matching; visual databases; document image databases; heavily touching characters; inexact string matching technique; partial word image matching technique; query word; similarity measurement; word spotting; word string; Electronics packaging; Image converters; Image databases; Image matching; Image retrieval; Image storage; Information retrieval; Internet; Optical character recognition software; Software libraries; Document image retrieval; partial word image matching; primitive string; word searching
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advance Computing Conference, 2009. IACC 2009. IEEE International
  • Conference_Location
    Patiala
  • Print_ISBN
    978-1-4244-2927-1
  • Electronic_ISBN
    978-1-4244-2928-8
  • Type
    conf
  • DOI
    10.1109/IADCC.2009.4809071
  • Filename
    4809071
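
The partial matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the primitive alphabet or the scoring scheme, so this example makes two labeled assumptions: ordinary characters stand in for primitive tokens, and a standard unit-cost approximate substring edit distance stands in for the paper's inexact string matching. The function names `partial_match_cost` and `partial_similarity` are hypothetical and do not come from the paper.

```python
def partial_match_cost(query, word):
    """Minimum unit-cost edit distance between `query` and any substring of `word`.

    Assumption: both arguments are sequences of primitive tokens; plain
    characters are used here as a stand-in for the paper's primitives.
    """
    # Row for the empty query prefix: it matches anywhere in `word` at zero cost,
    # which is what lets the query align to a portion of the word.
    prev = [0] * (len(word) + 1)
    for i, q in enumerate(query, start=1):
        # Aligning i query tokens against an empty substring costs i deletions.
        curr = [i] + [0] * len(word)
        for j, w in enumerate(word, start=1):
            curr[j] = min(
                prev[j] + 1,             # drop a query token
                curr[j - 1] + 1,         # absorb an extra word token
                prev[j - 1] + (q != w),  # match (0) or substitute (1)
            )
        prev = curr
    # The match may end at any position in `word`.
    return min(prev)


def partial_similarity(query, word):
    """Normalized score in [0, 1]; 1.0 means `query` occurs exactly inside `word`."""
    if not query:
        return 1.0
    return 1.0 - partial_match_cost(query, word) / len(query)
```

Under these assumptions, a query whose string appears inside a longer word string scores 1.0, which corresponds to deciding that one word image is a portion of the other; a threshold on the score would be needed for word spotting, and its value is not given in the abstract.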