Title :
Knowledge acquisition and representation for document structure recognition: The CAROL Project
Author :
Schmidt, Jörg ; Putz, Wolfgang
Author_Institution :
Gesellschaft fuer Math. und Datenverarbeitung mbH (GMD), Darmstadt, Germany
Abstract :
The authors describe a rule-based recognition system that reconstructs the structure of paper documents. The method is applied to an automatic cataloging system intended for use in libraries. Documents are scanned and passed through a character recognition engine. The output of the character recognition process, which carries additional layout information, serves as input to the rule interpreter. Rules for a specific document type are generated by a learning module, which enables the user to create a set of rules for a new document type. The learning component uses several generalization rules that are also found in machine learning systems. CAROL, a demonstration prototype, is currently being tested by librarians.
Keywords :
cataloguing; document handling; document image processing; expert systems; generalisation (artificial intelligence); knowledge acquisition; knowledge representation; learning (artificial intelligence); library automation; optical character recognition; CAROL Project; automatic cataloging system; character recognition engine; document structure recognition; generalization rules; knowledge acquisition; knowledge representation; learning module; libraries; machine learning systems; output format; rule based recognition system; rule interpreter; Acceleration; Artificial intelligence; Character recognition; Engines; Knowledge acquisition; Learning systems; Libraries; Optical character recognition software; Prototypes; Testing;
Conference_Title :
Artificial Intelligence for Applications, 1993. Proceedings., Ninth Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
0-8186-3840-0
DOI :
10.1109/CAIA.1993.366644