Segmentation of Text and Graphics from Document Images

Author

Chowdhury, S.P. ; Mandal, S. ; Das, A.K. ; Chanda, Bhabatosh

Author_Institution

B. E. & Sc. Univ., Durgapur

Volume

2

fYear

2007

fDate

23-26 Sept. 2007

Firstpage

619

Lastpage

623

Abstract

Text, graphics and half-tones are the major constituents of any document page. While half-tones can be characterised by their inherent intensity variation, text and graphics share common characteristics and differ only in spatial distribution. The success of document image analysis systems depends on the proper segmentation of text and graphics, as text is further subdivided into other classes such as headings, tables and math-zones. Segmentation of graphics is essential for better OCR performance and for vectorization in computer vision applications. Segmenting graphics from text is particularly difficult when the graphics are made of small components (dashed or dotted lines etc.), which have many features similar to text. Here we propose a robust technique for segmenting all sorts of graphics and text in any orientation from document pages.

Keywords

computer graphics; computer vision; document image processing; image segmentation; optical character recognition; text analysis; OCR; document image analysis system; graphics segmentation; text segmentation; Engineering drawings; Filtering; Filters; Image analysis; Optical character recognition software; Robustness

fLanguage

English

Publisher

IEEE

Conference_Titel

Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on

Conference_Location

Parana

ISSN

1520-5363

Print_ISBN

978-0-7695-2822-9

Type

conf

DOI

10.1109/ICDAR.2007.4376989

Filename

4376989