DocumentCode :
3384416
Title :
Keyword searching in compressed document images
Author :
Lu, Yue ; Tan, Chew Lim
Author_Institution :
Dept. of Comput. Sci., Nat. Univ. of Singapore, Singapore
fYear :
2003
fDate :
25-27 March 2003
Firstpage :
437
Abstract :
Summary form only given. A compressed pattern matching method is presented for searching keywords in CCITT Group 4-compressed document images without explicit decompression. Under the CCITT Group 4 standard, each coded position indicates that the current pixel's color differs from that of the previous pixel, except for the coded positions that follow a pass mode. The changing elements are extracted from the compressed images and then used to segment and bound the word objects and to measure the similarity of two word images. A two-stage matching strategy measures the dissimilarity between the template image of the user's query word and each word extracted from the document images. Experiments were conducted to verify the validity of the approach. The results show that the proposed approach is much faster than the traditional approach because it avoids the pixel-level processing needed to analyze connected components and extract word features.
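The notion of a changing element can be illustrated with a minimal Python sketch. It is not the authors' implementation: it works on an already decoded pixel row (the paper reads these positions directly from the compressed stream), and the function names `changing_elements` and `hausdorff` are hypothetical. The Hausdorff distance appears only in the record's keyword list, so its use here as the dissimilarity measure is an assumption.

```python
def changing_elements(row):
    """Return the column indices where the pixel color differs from the
    previous pixel. The imaginary pixel before column 0 is taken as
    white (0), following the CCITT Group 4 convention."""
    previous = 0
    positions = []
    for x, pixel in enumerate(row):
        if pixel != previous:
            positions.append(x)
        previous = pixel
    return positions


def directed_hausdorff(a, b):
    """Largest distance from any position in a to its nearest position in b."""
    return max(min(abs(p - q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty position lists
    (assumed here as the dissimilarity measure; 1-D for simplicity)."""
    if not a or not b:
        raise ValueError("both position lists must be non-empty")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Two hypothetical scan lines, 0 = white, 1 = black.
template_row = [0, 0, 1, 1, 1, 0, 1]
candidate_row = [0, 0, 1, 1, 0, 0, 1]

template_positions = changing_elements(template_row)
candidate_positions = changing_elements(candidate_row)
distance = hausdorff(template_positions, candidate_positions)
```

A small distance means the two rows change color at nearly the same columns; a real matcher would compare two-dimensional sets of changing elements over whole word images.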
Keywords :
data compression; digital libraries; document image processing; image coding; image matching; query formulation; visual databases; CCITT Group 4 standards; Hausdorff distance; Internet; PDF files; coarse-matching procedure; compressed document images; compressed pattern matching; digital libraries; explicit decompression; keyword searching; pass mode; pixel color; pixel-level processing; two-stage matching strategy; user query words; word objects; Books; Code standards; Computer science; Image coding; Image segmentation; Internet; Keyword search; Pattern matching; Pixel; Software libraries;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference, 2003. Proceedings. DCC 2003
ISSN :
1068-0314
Print_ISBN :
0-7695-1896-6
Type :
conf
DOI :
10.1109/DCC.2003.1194056
Filename :
1194056