DocumentCode :
2629211
Title :
Weak model-dependent page segmentation and skew correction for processing document images
Author :
Cullen, John F. ; Ejiri, Koichi
Author_Institution :
Ricoh California Res. Center, Menlo Park, CA, USA
fYear :
1993
fDate :
20-22 Oct 1993
Firstpage :
757
Lastpage :
760
Abstract :
Presents an algorithm for fast, accurate page segmentation that is, as far as possible, independent of a model for the document page. The method is based on a reduced image data representation that uses bounding rectangles and the run-length size distributions contained within those rectangles. The rectangles are the basis for skew detection, column identification, and the merging of text into blocks, and they help achieve accurate page segmentation. The reduced complexity that rectangles offer ensures a fast processing time. The method is applied to a broad range of documents found in a typical office environment. Documents are scanned at 400 dpi and stored as binary images.
Keywords :
data reduction; data structures; document image processing; image segmentation; merging; binary images; bounding rectangles; column identification; fast processing time; office document scanning; reduced complexity; reduced image data representation; run length size distributions; skew correction; skew detection; text blocks; text merging; weak model dependent page segmentation; Data mining; Digital images; Graphics; Image databases; Image segmentation; Optical character recognition software; Partitioning algorithms; Pixel; Text recognition; Tiles
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
Conference_Location :
Tsukuba Science City
Print_ISBN :
0-8186-4960-7
Type :
conf
DOI :
10.1109/ICDAR.1993.395627
Filename :
395627