DocumentCode
436553
Title
Filtering in Chinese document images based on templates and confidence measure
Author
Jiewei, Chen ; Weiran, Xu ; Jun, Guo
Author_Institution
Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., China
Volume
2
fYear
2004
fDate
31 Aug.-4 Sept. 2004
Firstpage
1376
Abstract
A fast approach to keyword spotting in Chinese document images, based on multiple-template matching and a confidence measure, is presented. Before keyword searching begins, the system generates a keyword lexicon covering diverse fonts, together with two-stage feature vectors. A two-stage retrieval scheme combined with the Boyer-Moore algorithm is proposed to accelerate the retrieval process. A distance measure between the candidate character and the templates is used to identify and rank similar templates. The new system performs significantly better than traditional OCR and image-based approaches. Experimental results confirm the robustness of the proposed approach over a wide range of degradations.
Keywords
character recognition; document image processing; feature extraction; image matching; image retrieval; information filtering; natural languages; Boyer-Moore algorithm; Chinese document image filter; candidate character; confidence measure; information filtering; keyword lexicon; multiple template matching; two-stage feature vector; two-stage retrieval scheme; Acceleration; Character recognition; Degradation; Image recognition; Image retrieval; Image segmentation; Information filtering; Information retrieval; Optical character recognition software; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing, 2004. Proceedings. ICSP '04. 2004 7th International Conference on
Print_ISBN
0-7803-8406-7
Type
conf
DOI
10.1109/ICOSP.2004.1441582
Filename
1441582