Title :
High Efficient Compression Strategy for Scanned Receipts and Handwritten Documents
Author :
Xu Danhua ; Bao Xudong
Author_Institution :
Sch. of Comput. & Software, Nanjing Univ. of Inf. Sci. & Technol., Nanjing, China
Abstract :
Image compression is a long-standing topic in image processing that has been widely studied and applied. Standards such as JPEG and JPEG 2000 have been published for applications dealing with grayscale or color photographs and medical images. However, some specific applications, such as electronic financial management systems (eFMS), require much more efficient algorithms for compressing receipts and handwritten documents. A new compression strategy is presented that separates foreground from background, on the assumption that the foreground carries the most important information and should therefore be degraded less, while the background only conveys the realistic appearance of the document and can tolerate more degradation. The image is first transformed to the YCbCr color space to separate intensity from color. The foreground and background are then extracted from the intensity subimage with a median filter. Both are down-sampled and clustered separately based on their gray-level histograms. The chrominance (chromatic aberration) subimages are also down-sampled and converted to a palette-index model by clustering on their 2D histogram. All clustered subimages are encoded with the run-length encoding (RLE) scheme used in JPEG and finally combined. The results demonstrate much higher compression rates for the presented strategy than for the JPEG standard.
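The first stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameters, so the full-range BT.601 (JFIF) conversion coefficients, the filter radius, the darkness threshold, the plain run-length coder, and all function names (`rgb_to_ycbcr`, `median_filter`, `split_foreground`, `run_length_encode`) are assumptions made for illustration. Down-sampling and histogram clustering are omitted.

```python
from statistics import median_low


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (full-range BT.601, as in JFIF).

    Assumption: the abstract does not say which YCbCr variant is used.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


def median_filter(img, radius=1):
    """Median-filter a 2D list of gray levels; windows are clipped at borders."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [
                img[ii][jj]
                for ii in range(max(0, i - radius), min(h, i + radius + 1))
                for jj in range(max(0, j - radius), min(w, j + radius + 1))
            ]
            out[i][j] = median_low(window)  # keeps integer gray levels
    return out


def split_foreground(luma, radius=1, threshold=40):
    """Estimate the background with a median filter and mark as foreground
    every pixel that is darker than the background by more than `threshold`.

    Assumption: dark ink on lighter paper; radius and threshold are
    hypothetical values, not taken from the paper.
    """
    background = median_filter(luma, radius)
    mask = [
        [background[i][j] - luma[i][j] > threshold for j in range(len(luma[0]))]
        for i in range(len(luma))
    ]
    return background, mask


def run_length_encode(values):
    """Plain run-length coding of a sequence into (value, run_length) pairs.

    Assumption: a generic RLE, not the exact JPEG entropy-coding stage.
    """
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(run) for run in runs]
```

In this sketch a thin dark stroke disappears from the median-filtered image, so the filtered image serves as the background estimate and the difference from the original marks the stroke as foreground. Clustered index rows contain long runs of identical values, which is why run-length coding suits them.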
Keywords :
data compression; document image processing; handwriting recognition; handwritten character recognition; image coding; 2D histogram; JPEG 2000; JPEG standard; chromatic aberration subimages; electronic financial management systems; gray histograms; handwritten documents; image compression; image processing; median filter; medical images; palette-index model; scanned receipts; Algorithm design and analysis; Biomedical imaging; Data mining; Degradation; Financial management; Histograms; Image coding; Image processing; Standards publication; Transform coding;
Conference_Title :
Information Science and Engineering (ICISE), 2009 1st International Conference on
Conference_Location :
Nanjing
Print_ISBN :
978-1-4244-4909-5
DOI :
10.1109/ICISE.2009.632