DocumentCode :
1994302
Title :
Word searching in CCITT group 4 compressed document images
Author :
Lu, Yue ; Tan, Chew Lim
Author_Institution :
Dept. of Comput. Sci., Nat. Univ. of Singapore, Kent Ridge, Singapore
fYear :
2003
fDate :
3-6 Aug. 2003
Firstpage :
467
Abstract :
In this paper, we present a compressed pattern matching method for searching for user-queried words in CCITT Group 4 compressed document images without decompressing them. Feature pixels, composed of black changing elements and white changing elements, are extracted directly from the compressed images. Connected components are labeled line by line according to the relative positions of the changing elements of the current coding line and those of the reference line. Word boxes are then formed by merging the connected components. A two-stage matching strategy measures the dissimilarity between the template image of the user's query word and the words extracted from the document images. Experimental results confirmed the validity of the proposed approach.
Keywords :
character recognition; document image processing; image coding; image matching; black changing elements; coding line; compressed document images; compressed pattern matching; line-by-line strategy; reference line; user queried words; white changing elements; word boxes; word searching; Computer science; Image coding; Image recognition; Image storage; Internet; Merging; Optical character recognition software; Pattern matching; Pixel; Software libraries;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
Print_ISBN :
0-7695-1960-1
Type :
conf
DOI :
10.1109/ICDAR.2003.1227709
Filename :
1227709