DocumentCode :
3058186
Title :
Page segmentation without rectangle assumption
Author :
Saitoh, Takashi ; Pavlidis, Theo
Author_Institution :
Ricoh R&D Center, Kanagawa, Japan
fYear :
1992
fDate :
30 Aug-3 Sep 1992
Firstpage :
277
Lastpage :
280
Abstract :
A new technique for page segmentation without skew normalization is described and applied to complex printed-page layouts in both English and Japanese. Because no assumption is made about the shape of blocks, the technique handles skewed pages and can also be extended to documents whose columns are not rectangular. The technique follows a bottom-up strategy: connected components are extracted from a reduced image and classified using their local information. Since the skew angle is also estimated from the local information of blocks, the computational time is very short. Text blocks are merged into string lines and then into columns using the skew information.
Keywords :
document image processing; image segmentation; optical character recognition; OCR preprocessing; block extraction; bottom-up strategy; page segmentation; printed-page layouts; skew angle; skewed pages; string lines; Aggregates; Computer science; Data mining; Image segmentation; Optical character recognition software; Pixel; Research and development; Shape; Streaming media; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition, 1992. Vol.II. Conference B: Pattern Recognition Methodology and Systems, Proceedings., 11th IAPR International Conference on
Conference_Location :
The Hague
Print_ISBN :
0-8186-2915-0
Type :
conf
DOI :
10.1109/ICPR.1992.201772
Filename :
201772