DocumentCode :
2840063
Title :
Knowledge-based derivation of document logical structure
Author :
Niyogi, Debashish ; Srihari, Sargur N.
Author_Institution :
CEDAR, State Univ. of New York, Buffalo, NY, USA
Volume :
1
fYear :
1995
fDate :
14-16 Aug 1995
Firstpage :
472
Abstract :
The analysis of a document image to derive a symbolic description of its structure and contents involves using spatial domain knowledge to classify the different printed blocks (e.g., text paragraphs), group them into logical units (e.g., newspaper stories), and determine the reading order of the text blocks within each unit. These steps describe the conversion of the physical structure of a document into its logical structure. We have developed a computational model for document logical structure derivation, in which a rule-based control strategy uses the data obtained from analyzing a digitized document image and makes inferences using a multi-level knowledge base of document layout rules. The knowledge-based document logical structure derivation system (DeLoS) based on this model consists of a hierarchical rule-based control system to guide the block classification, grouping, and read-ordering operations; a global data structure to store the document image data and incremental inferences; and a domain knowledge base to encode the rules governing document layout.
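The three operations named in the abstract (block classification, grouping, and read-ordering) can be illustrated with a minimal Python sketch. This is not the DeLoS rule base or its control strategy, which the abstract does not specify. The `Block` fields, the font-height threshold, the "nearest headline above" grouping rule, and the column-then-row ordering are all hypothetical assumptions chosen only to make the pipeline concrete.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A printed block from page segmentation (all fields are illustrative)."""
    text: str
    x: int            # left edge
    y: int            # top edge
    w: int            # width
    h: int            # height
    font_height: int  # assumed to come from image analysis
    label: str = "unknown"


def classify(block, headline_min_font=20):
    """Hypothetical layout rule: large type is a headline, the rest is body text."""
    block.label = "headline" if block.font_height >= headline_min_font else "paragraph"
    return block


def group(blocks):
    """Hypothetical grouping rule: a paragraph belongs to the nearest headline
    that lies above it and overlaps it horizontally."""
    headlines = sorted((b for b in blocks if b.label == "headline"),
                       key=lambda b: (b.y, b.x))
    units = {id(h): [h] for h in headlines}
    for b in blocks:
        if b.label != "paragraph":
            continue
        above = [h for h in headlines
                 if h.y + h.h <= b.y and h.x < b.x + b.w and b.x < h.x + h.w]
        if above:
            owner = max(above, key=lambda h: h.y)  # closest headline above
            units[id(owner)].append(b)
    return [units[id(h)] for h in headlines]


def reading_order(unit):
    """Hypothetical ordering rule: headline first, then body blocks
    column by column (left to right), top to bottom within a column."""
    head, body = unit[0], unit[1:]
    return [head] + sorted(body, key=lambda b: (b.x, b.y))


def derive_logical_structure(blocks):
    """Run classification, grouping, and read-ordering in sequence."""
    classified = [classify(b) for b in blocks]
    return [reading_order(u) for u in group(classified)]
```

A real system of the kind the abstract describes would replace each hard-coded function with rules drawn from a knowledge base and would record intermediate inferences in a shared data structure; the sketch only shows how the three stages feed one another.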
Keywords :
document image processing; inference mechanisms; knowledge based systems; DeLoS; block classification; document image; document image data; document layout; document layout rules; document logical structure; document logical structure derivation; grouping; incremental inferences; inferences; knowledge-based; multi-level knowledge base; read-ordering operations; rule-based control strategy; spatial domain knowledge; symbolic description; Computational modeling; Control system synthesis; Data analysis; Data structures; Image analysis; Image converters; Image recognition; Solid modeling; Text analysis; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-8186-7128-9
Type :
conf
DOI :
10.1109/ICDAR.1995.599038
Filename :
599038