Title :
Graph b-Coloring for Automatic Recognition of Documents
Author :
Gaceb, Djamel ; Eglin, Véronique ; Lebourgeois, Frank ; Emptoz, Hubert
Author_Institution :
LIRIS, INSA de Lyon, Villeurbanne, France
Abstract :
To reduce the rejection rate of our automatic reading system, we propose to pre-classify business documents by introducing an automatic recognition of documents (ARD) stage as a pre-processing step. This step guides the other stages involved in recognizing the document contents. Once the document class is identified, the reading system uses the information from the ARD stage to improve the segmentation of the layout, the recognition of the document structure, the parameterization of the OCR, and the final rejection decision. We propose in this paper an original method for classifying business documents that is suited to complex layouts with great variability. We introduce a graph coloring approach for both layout analysis and document classification. The proposed method is reliable, robust to various constraints, and guarantees a real-time response when sorting business documents.
Keywords :
document image processing; graph colouring; image classification; image segmentation; optical character recognition; ARD; OCR; automatic document recognition; business document classification; graph b-coloring; Classification tree analysis; Dictionaries; Dynamic programming; Image analysis; Image recognition; Information analysis; Optical character recognition software; Robustness; Sorting; Text analysis; Automatic recognition of documents; Classification of business documents; Document sorting; Graph b-Coloring; Segmentation of the layout
Conference_Title :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
DOI :
10.1109/ICDAR.2009.72