• DocumentCode
    2362967
  • Title
    Knowledge acquisition and representation for document structure recognition: The CAROL Project
  • Author
    Schmidt, Jörg ; Putz, Wolfgang
  • Author_Institution
    Gesellschaft fuer Math. und Datenverarbeitung mbH (GMD), Darmstadt, Germany
  • fYear
    1993
  • fDate
    1-5 Mar 1993
  • Firstpage
    177
  • Lastpage
    181
  • Abstract
    The authors describe a rule-based recognition system that rebuilds the structure of paper documents. The method is applied to an automatic cataloging system intended for use in libraries. Documents are scanned and run through a character recognition engine. The output of the character recognition process, a format carrying additional layout information, serves as input to the rule interpreter. Rules for a specific document type are generated by a learning module, which lets the user create a rule set for a new document type. The learning component uses several generalization rules that are also found in machine learning systems. CAROL, a demonstration prototype, is currently being tested by librarians.
  • Keywords
    cataloguing; document handling; document image processing; expert systems; generalisation (artificial intelligence); knowledge acquisition; knowledge representation; learning (artificial intelligence); library automation; optical character recognition; CAROL Project; automatic cataloging system; character recognition engine; document structure recognition; generalization rules; learning module; libraries; machine learning systems; output format; rule based recognition system; rule interpreter; Acceleration; Artificial intelligence; Character recognition; Engines; Learning systems; Optical character recognition software; Prototypes; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Artificial Intelligence for Applications, 1993. Proceedings., Ninth Conference on
  • Conference_Location
    Orlando, FL
  • Print_ISBN
    0-8186-3840-0
  • Type
    conf
  • DOI
    10.1109/CAIA.1993.366644
  • Filename
    366644