DocumentCode :
2870938
Title :
A robust document processing system combining image segmentation with content-based document compression
Author :
Yang, Yibing ; Yan, Hong
Author_Institution :
Sch. of Electr. & Inf. Eng., Sydney Univ., NSW, Australia
Volume :
4
fYear :
2000
fDate :
2000
Firstpage :
519
Abstract :
A document processing system combining image segmentation with content-based document compression is proposed in this paper. First, a grayscale document image is divided into small blocks and analysed. Then, a modified logical thresholding method, based on local structure analysis and the adaptive logical level technique, is used to transform the grayscale document into a binary image. All patterns are extracted from the binary document, and a multistage matching method is used to extract representative patterns. A decomposition method is used to deal with relatively large patterns. Finally, high-ratio compression is achieved by coding the relative positions of symbols, the extracted representative patterns and the other decomposed patterns using the adaptive arithmetic coder and the Q-Coder, respectively.
Keywords :
data compression; document image processing; image coding; image segmentation; Q-Coder; adaptive arithmetic coder; adaptive logical level technique; binary image; content-based document compression; decomposition method; high ratio compression; local structure analysis; modified logical thresholding method; multistage matching method; robust document processing system; Arithmetic; Data mining; Gray-scale; Image analysis; Image coding; Image segmentation; Information analysis; Pattern matching; Robustness; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition, 2000. Proceedings. 15th International Conference on
Conference_Location :
Barcelona
ISSN :
1051-4651
Print_ISBN :
0-7695-0750-6
Type :
conf
DOI :
10.1109/ICPR.2000.902971
Filename :
902971