DocumentCode :
3307580
Title :
Page segmentation and classification utilising a bottom-up approach
Author :
Drivas, Dimitrios ; Amin, Adnan
Author_Institution :
Sch. of Comput. Sci. & Eng., New South Wales Univ., Kensington, NSW, Australia
Volume :
2
fYear :
1995
fDate :
14-16 Aug 1995
Firstpage :
610
Abstract :
This paper analyses the connected components extracted from the binary image of a document page. The analysis yields information that is used to perform skew correction, segmentation and classification of the document. We present a new algorithm for determining the skew angle of lines of text in a document image; its advantage is that it needs only one iteration to determine the skew angle. Experiments on over 30 pages show that the method works well on a wide variety of layouts, including sparse textual regions, mixed fonts, multiple columns, and even documents with a high graphical content.
Keywords :
character sets; document image processing; image classification; image segmentation; optical character recognition; OCR; binary image analysis; bottom-up approach; document page analysis; experiments; graphics; layouts; mixed fonts; multiple columns; page classification; page segmentation; skew angle; skew correction; sparse textual regions; text; Australia; Computer science; Data mining; Detection algorithms; Graphics; Image analysis; Image segmentation; Information analysis; Layout; Optical character recognition software
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-8186-7128-9
Type :
conf
DOI :
10.1109/ICDAR.1995.601970
Filename :
601970