• DocumentCode
    436553
  • Title
    Filtering in Chinese document images based on templates and confidence measure
  • Author
    Jiewei, Chen ; Weiran, Xu ; Jun, Guo
  • Author_Institution
    Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., China
  • Volume
    2
  • fYear
    2004
  • fDate
    31 Aug.-4 Sept. 2004
  • Firstpage
    1376
  • Abstract
    A fast approach to keyword spotting in Chinese document images, based on multiple-template matching and a confidence measure, is presented. Before keyword searching begins, the system generates a keyword lexicon covering diverse fonts, together with two-stage feature vectors. A two-stage retrieval scheme combined with the Boyer-Moore algorithm is proposed to accelerate the retrieval process. A distance measure between the candidate character and the templates is used to identify and rank similar templates. The performance of the new system is significantly improved compared with traditional OCR and image-based approaches. Experimental results confirm the robustness of the proposed approach over a wide range of degradations.
  • Keywords
    character recognition; document image processing; feature extraction; image matching; image retrieval; information filtering; natural languages; Boyer-Moore algorithm; Chinese document image filter; candidate character; confidence measure; keyword lexicon; multiple template matching; two-stage feature vector; two-stage retrieval scheme; Acceleration; Character recognition; Degradation; Image recognition; Image retrieval; Image segmentation; Information filtering; Information retrieval; Optical character recognition software; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing, 2004. Proceedings. ICSP '04. 2004 7th International Conference on
  • Print_ISBN
    0-7803-8406-7
  • Type
    conf
  • DOI
    10.1109/ICOSP.2004.1441582
  • Filename
    1441582
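
  • Illustrative sketch (not part of the original record)
    The abstract states that a distance measure between the candidate character and the templates is used to identify and rank similar templates. The Python sketch below shows one way such a ranking could look. It is only an illustration under stated assumptions: the record does not say which distance the authors use, so Euclidean distance is assumed here; the names rank_templates and confidence, the template labels, and the ratio-based confidence score are all hypothetical and are not taken from the paper.

```python
import math


def rank_templates(candidate, templates):
    """Return (label, distance) pairs sorted by ascending distance.

    Assumption: Euclidean distance between feature vectors. The paper's
    actual distance measure is not given in this record.
    """
    ranked = []
    for label, vector in templates.items():
        if len(vector) != len(candidate):
            raise ValueError("feature vectors must have the same length")
        ranked.append((label, math.dist(candidate, vector)))
    # Smallest distance first, so the most similar template leads the list.
    ranked.sort(key=lambda pair: pair[1])
    return ranked


def confidence(ranked):
    """Hypothetical confidence score in [0, 1].

    Compares the best distance with the runner-up: a clear winner gives a
    value near 1, a near tie gives a value near 0. This is an illustrative
    choice, not the confidence measure defined by the authors.
    """
    if not ranked:
        raise ValueError("no templates were ranked")
    if len(ranked) == 1:
        return 1.0
    best, second = ranked[0][1], ranked[1][1]
    if second == 0:
        return 0.0  # two exact matches: the result is ambiguous
    return 1.0 - best / second


# Hypothetical templates: one feature vector per font variant of a character.
templates = {
    "song": [0.0, 0.0],
    "kai": [1.0, 0.0],
    "hei": [3.0, 4.0],
}
ranked = rank_templates([0.0, 0.0], templates)
score = confidence(ranked)
```

    In this sketch a candidate would be accepted as a keyword character only if the score exceeds a threshold chosen by the user; that acceptance rule is likewise an assumption.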