DocumentCode :
2900128
Title :
Document page segmentation and layout analysis using soft ordering
Author :
Mitchell, Phillip E. ; Yan, Hong
Author_Institution :
Sch. of Electr. & Inf. Eng., Sydney Univ., NSW, Australia
Volume :
1
fYear :
2000
fDate :
2000
Firstpage :
458
Abstract :
This paper presents a novel algorithm for layout analysis of document images. A major component of this algorithm is an independent segmentation algorithm that identifies text and graphics regions. The segmentation algorithm first locates document patterns and then performs classification using run-length characteristics, spread analysis and adjacency relations. A key feature of the layout analysis algorithm is soft ordering, which provides a means of ordering regions in a more logical way and allows for some overlap between separate regions. This is very useful for processing documents that are slightly skewed or irregular in layout. The algorithm has been tested on many different documents and can successfully recognise single- and multicolumn documents, even when the column format varies several times on one page. Furthermore, it can process documents with text tightly wrapped around graphics and documents that are slightly skewed.
Keywords :
document image processing; image segmentation; adjacency relations; document images; document layout analysis; document page segmentation; graphics; graphics region identification; independent segmentation algorithm; multicolumn documents; region overlap; run-length characteristics; skewed documents; soft ordering; spread analysis; text region identification; Algorithm design and analysis; Graphics; Image analysis; Image segmentation; Independent component analysis; Layout; Pattern analysis; Performance analysis; Testing; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition, 2000. Proceedings. 15th International Conference on
Conference_Location :
Barcelona
ISSN :
1051-4651
Print_ISBN :
0-7695-0750-6
Type :
conf
DOI :
10.1109/ICPR.2000.905375
Filename :
905375