Title :
From acquisition to modelisation of a form base to retrieve information
Author :
Diana, S. ; Trupin, E. ; Jouzel, F. ; Lecourtier, Y. ; Labiche, J.
Author_Institution :
Rouen Univ., Mont-Saint-Aignan, France
Abstract :
This article describes a module for information structure extraction. The module is applied to forms used by the CAF (Caisse d'Allocations Familiales), the French National Family Allowance Department. Its aim is to create a base of information structures (models) that define the forms and their content, so that the forms can be processed automatically. The module covers the stages of form processing, from acquisition to modelisation, and is composed of three stages. The first performs low-level processing, i.e. binarisation and skew correction. The second extracts the informative features contained in the forms. The last organises these features into a hierarchical structure to obtain the form model. The resulting base of information structures will be used by a second module, which identifies form types by comparing these information structures.
Keywords :
business forms; document image processing; feature extraction; financial data processing; information retrieval; public administration; visual databases; National Family Allowance Department; binarisation; document processing; form base acquisition; forms; hierarchical structure; information structure extraction; low-level processing; skew correction; type form identification; Costs; Data mining; Documentation; Information analysis; Information management; Storage automation; Text analysis
Conference_Title :
Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997
Conference_Location :
Ulm
Print_ISBN :
0-8186-7898-4
DOI :
10.1109/ICDAR.1997.620612