Title :
A fast and efficient method for extracting text paragraphs and graphics from unconstrained documents
Author :
Lebourgeois, F. ; Bublinski ; Emptoz, H.
Author_Institution :
Lab. de Modelisation des Syst. et Reconnaissance de Formes, INSA de Lyon, Villeurbanne, France
Date :
30 Aug-3 Sep 1992
Abstract :
Outlines a fast and efficient method for extracting graphics and text paragraphs from printed documents. The method is based on a bottom-up approach to document analysis and achieves very good performance in most cases. During preprocessing, characters are linked together to form blocks. The resulting blocks are segmented, labelled and merged into paragraphs. Simultaneously, graphics are extracted from the image. Algorithms for each processing step are presented, and experimental results are included.
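The abstract does not give the algorithms themselves, but the keywords name the run length smoothing algorithm as one ingredient. As a minimal sketch only, and not the authors' implementation, a horizontal smoothing pass that links nearby characters into blocks might look like the following. The function names `rlsa_row` and `rlsa_horizontal`, the parameter `max_gap`, and the convention that 1 marks ink and 0 marks background are all assumptions made for illustration.

```python
def rlsa_row(row, max_gap):
    """Horizontal run-length smoothing on one row of a binary image.

    Assumed convention (not from the paper): 1 = ink, 0 = background.
    A background run is filled only if it lies between two ink pixels
    and is at most max_gap pixels long.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_ink is not None:
                gap = i - last_ink - 1
                if 0 < gap <= max_gap:
                    # Short gap between two ink pixels: fill it in.
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, max_gap):
    """Apply rlsa_row to every row of a binary image (list of lists)."""
    return [rlsa_row(row, max_gap) for row in image]


if __name__ == "__main__":
    row = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
    print(rlsa_row(row, 2))  # [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

With `max_gap = 2` the two-pixel gap is filled and the four-pixel gap is left open, so neighbouring characters merge into one block while wider column or paragraph gaps survive. The threshold here is hypothetical; the record does not state what values the paper uses.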
Keywords :
document image processing; image segmentation; text editing; document analysis; document processing; graphics extraction; labelling; run length smoothing algorithm; segmentation; text paragraph extraction; unconstrained documents; Data mining; Graphics; Image analysis; Joining processes; Performance analysis; Pixel; Reconnaissance; Smoothing methods; Text analysis
Conference_Title :
Pattern Recognition, 1992. Vol.II. Conference B: Pattern Recognition Methodology and Systems, Proceedings., 11th IAPR International Conference on
Conference_Location :
The Hague
Print_ISBN :
0-8186-2915-0
DOI :
10.1109/ICPR.1992.201771