DocumentCode :
2180116
Title :
Supporting information extraction from printed documents by Lexico-Semantic pattern matching
Author :
Wenzel, Claudia
Author_Institution :
German Res. Center for Artificial Intelligence, Kaiserslautern, Germany
Volume :
2
fYear :
1997
fDate :
18-20 Aug 1997
Firstpage :
732
Abstract :
Document analysis and understanding (DAU) systems aim not only at the recognition of text and document structures but also at the extraction of relevant information from a scanned document. Depending on the class of a document, the information to be extracted may be defined in advance in syntactic structures as well as in semantic structures. In this paper we present a system for detecting such information and transforming it into a semantic representation. The basic component is a pattern matcher which incorporates geometric positions to detect phrases in the document. By defining a Levenshtein distance, the component matches more leniently in order to be error-tolerant against OCR failures.
Keywords :
image recognition; information retrieval; knowledge acquisition; pattern matching; Levenshtein distance; Lexico-Semantic pattern matching; document analysis and understanding systems; document structures; information extraction; printed documents; semantic representation; semantic structures; text recognition; Artificial intelligence; Data mining; Information analysis; Optical character recognition software; Pattern analysis; Pattern matching; Pattern recognition; Text analysis; Text recognition; Workflow management software
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
Conference_Location :
Ulm
Print_ISBN :
0-8186-7898-4
Type :
conf
DOI :
10.1109/ICDAR.1997.620605
Filename :
620605