DocumentCode
1479198
Title
Scanned Compound Document Encoding Using Multiscale Recurrent Patterns
Author
Francisco, Nelson C. ; Rodrigues, Nuno M M ; da Silva, E.A.B. ; De Carvalho, Murilo Bresciani ; De Faria, Sérgio M M ; Silva, Eduardo A B da
Author_Institution
Inst. de Telecomun., Leiria, Portugal
Volume
19
Issue
10
fYear
2010
Firstpage
2712
Lastpage
2724
Abstract
In this paper, we propose a new encoder for scanned compound documents, based upon a recently introduced coding paradigm called multidimensional multiscale parser (MMP). MMP uses approximate pattern matching, with adaptive multiscale dictionaries that contain concatenations of scaled versions of previously encoded image blocks. These features give MMP the ability to adjust to the input image's characteristics, resulting in high coding efficiencies for a wide range of image types. This versatility makes MMP a good candidate for compound digital document encoding. The proposed algorithm first classifies the image blocks as either smooth (texture) or nonsmooth (text and graphics). Smooth and nonsmooth blocks are then compressed using different MMP-based encoders, each adapted to one of the two block types. The adaptive use of these two types of encoders resulted in performance gains over the original MMP algorithm, further increasing the performance advantage over the current state-of-the-art image encoders for scanned compound images, without compromising the performance for other image types.
Keywords
document image processing; encoding; grammars; image coding; adaptive multiscale dictionaries; approximate pattern matching; concatenations; encoded image blocks; multidimensional multiscale parser; multiscale recurrent patterns; scanned compound document encoding; adaptive pattern matching; compound images; dictionary based coding; scanned document compression; vector quantization
fLanguage
English
Journal_Title
Image Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1057-7149
Type
jour
DOI
10.1109/TIP.2010.2049181
Filename
5454328