• DocumentCode
    3023272
  • Title

    Intelligent document processing

  • Author

    Esposito, Floriana ; Ferilli, Stefano ; Basile, Teresa M. A. ; Di Mauro, Nicola

  • Author_Institution
    Dept. of Comput. Sci., Bari Univ., Italy
  • fYear
    2005
  • fDate
    29 Aug.-1 Sept. 2005
  • Firstpage
    1100
  • Abstract
    Digital repositories raise the need for effective and efficient retrieval of the stored material. In this paper, we propose the intensive application of intelligent techniques to the steps of document layout analysis, document image classification and understanding on digital documents. Specifically, the complex interrelations among layout components, which are fundamental to assigning them their proper semantic roles, suggest the exploitation of first-order representations in some learning steps. Results obtained in a prototypical system for scientific conference management show that the proposed approach can be beneficial both for layout recognition and for the selection of interesting components of the document, from which the text is extracted to categorize the document according to its topic.
  • Keywords
    document image processing; image classification; information retrieval; digital documents; digital repositories; document image classification; document layout analysis; document layout recognition; first-order representations; intelligent document processing; scientific conference management; stored material retrieval; Application software; Computer science; Conference management; Image analysis; Image classification; Iterative algorithms; Machine learning; Page description languages; Prototypes; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
  • ISSN
    1520-5263
  • Print_ISBN
    0-7695-2420-6
  • Type

    conf

  • DOI
    10.1109/ICDAR.2005.144
  • Filename
    1575714