Title :
A robust and efficient algorithm for bilevel document block classification
Author :
Pappas, Thrasyvoulos N. ; Tseng, Snow H. ; Kosiba, David A.
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Northwestern Univ., Evanston, IL, USA
fDate :
2001
Abstract :
We present a robust and computationally efficient algorithm for classifying blocks of bilevel machine-printed documents as text or halftone. It uses a simple mask that exploits the different correlation properties of text and halftone regions, and it performs comparably to or better than more sophisticated and computationally intensive spectral analysis techniques. The proposed algorithm is a key component of a document recognition system that segments a document into regions, classifies them into text, halftone, line-art, etc., and then analyzes the regions to obtain a document interpretation. The input data are unusually challenging: the documents are multilingual and unoriented (e.g., upside down), and they range from ideal (machine-generated) images to very low quality (e.g., copied and faxed) images. We test the proposed algorithm on the University of Washington database and demonstrate its performance on a variety of images from different databases, as well as on synthetic images.
Keywords :
document image processing; image classification; image segmentation; bilevel document block classification; correlation properties; halftone regions; image quality; machine-printed documents; text regions; Algorithm design and analysis; Business; Classification algorithms; Image databases; Image generation; Image segmentation; Robustness; Snow; Spectral analysis; Testing;
Conference_Titel :
Image Processing, 2001. Proceedings. 2001 International Conference on
Conference_Location :
Thessaloniki
Print_ISBN :
0-7803-6725-1
DOI :
10.1109/ICIP.2001.959248