DocumentCode
3307580
Title
Page segmentation and classification utilising a bottom-up approach
Author
Drivas, Dimitrios ; Amin, Adnan
Author_Institution
Sch. of Comput. Sci. & Eng., New South Wales Univ., Kensington, NSW, Australia
Volume
2
fYear
1995
fDate
14-16 Aug 1995
Firstpage
610
Abstract
This paper presents an analysis of the connected components extracted from the binary image of a document page. The analysis provides useful information that is used to perform skew correction, segmentation and classification of the document. We present a new algorithm for determining the skew angle of lines of text in a document image, with the advantage that it requires only one iteration to determine the skew angle. Experiments on over 30 pages show that the method works well on a wide variety of layouts, including sparse textual regions, mixed fonts, multiple columns, and even documents with a high graphical content.
Keywords
character sets; document image processing; image classification; image segmentation; optical character recognition; OCR; binary image analysis; bottom-up approach; document page analysis; experiments; graphics; layouts; mixed fonts; multiple columns; page classification; page segmentation; skew angle; skew correction; sparse textual regions; text; Australia; Computer science; Data mining; Detection algorithms; Graphics; Image analysis; Image segmentation; Information analysis; Layout; Optical character recognition software
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location
Montreal, Que.
Print_ISBN
0-8186-7128-9
Type
conf
DOI
10.1109/ICDAR.1995.601970
Filename
601970