  • DocumentCode
    2015117
  • Title
    Keyword Spotting and Retrieval of Document Images Captured by a Digital Camera
  • Author
    Lu, Shijian ; Tan, Chew Lim
  • Author_Institution
    Nat. Univ. of Singapore, Singapore
  • Volume
    2
  • fYear
    2007
  • fDate
    23-26 Sept. 2007
  • Firstpage
    994
  • Lastpage
    998
  • Abstract
    This paper presents a keyword spotting technique that locates keywords within document images captured by a digital camera. In the proposed technique, the shape of word images in perspective view is captured using three perspective invariants, namely holes, water reservoirs, and character ascenders and descenders. Given a camera image of a document, text lines and word images are first segmented through connected component analysis. The three perspective invariants are then detected in two rounds of scanning, which transliterate each character image into a six-dimensional character shape code and thereby convert each word image into a word shape code. Keywords within camera images of documents are finally located through a partial matching process. Experiments show some promising results.
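The final matching step described in the abstract can be illustrated with a minimal sketch. The abstract states only that each character shape code has dimension six and that matching is partial. Everything else here is an assumption made for illustration: the six-integer tuple layout, the names `partial_match` and `spot_keyword`, and the reading of "partial matching" as a contiguous-run search. It is not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical representation: one six-integer tuple per character.
# The abstract gives the dimension (six) but not what each field holds.
CharCode = Tuple[int, int, int, int, int, int]
WordCode = List[CharCode]


def partial_match(keyword: WordCode, word: WordCode) -> int:
    """Return the first offset at which `keyword` occurs as a contiguous
    run inside `word`, or -1 if it does not occur (assumed matching rule)."""
    n, m = len(keyword), len(word)
    if n == 0 or n > m:
        return -1
    for start in range(m - n + 1):
        if word[start:start + n] == keyword:
            return start
    return -1


def spot_keyword(keyword: WordCode, page_words: List[WordCode]) -> List[int]:
    """Return indices of the words whose shape code contains the keyword's code."""
    return [i for i, w in enumerate(page_words) if partial_match(keyword, w) >= 0]
```

Under this assumed rule, a keyword also matches longer words that contain it, such as inflected forms, which is one plausible reason to prefer partial over exact matching.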
  • Keywords
    image sensors; optical character recognition; character image; character shape code; digital camera; images captured document; keyword spotting technique; partial matching process; scanning process; text line; word images; Digital cameras; Image analysis; Image coding; Image converters; Image retrieval; Image segmentation; Optical character recognition software; Reservoirs; Shape; Water resources;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
  • Conference_Location
    Parana
  • ISSN
    1520-5363
  • Print_ISBN
    978-0-7695-2822-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.2007.4377064
  • Filename
    4377064