• DocumentCode
    2057505
  • Title

    Modeling documents for structure recognition using generalized N-grams

  • Author

    Brugger, R. ; Zramdini, A. ; Ingold, R.

  • Author_Institution
    Inst. de Inf., Fribourg Univ., Switzerland
  • Volume
    1
  • fYear
    1997
  • fDate
    18-20 Aug 1997
  • Firstpage
    56
  • Abstract
    We present and discuss a novel approach to modeling the logical structures of documents, based on a statistical representation of patterns in a document class. An efficient and error-tolerant recognition heuristic adapted to the model is proposed. The statistical approach permits easily automated, incremental learning of the model. The approach has been partially evaluated on a prototype, and the results achieved by the prototype are discussed.
  • Keywords
    document image processing; image recognition; software fault tolerance; statistical analysis; trees (mathematics); document class; document modelling; error tolerant recognition heuristics; generalized N-grams; incremental learning; logical structures; statistical approach; statistical pattern representation; structure recognition; Application software; Decision trees; Error correction; Humans; Knowledge based systems; Optical character recognition software; Prototypes; Software prototyping; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
  • Conference_Location
    Ulm
  • Print_ISBN
    0-8186-7898-4
  • Type

    conf

  • DOI
    10.1109/ICDAR.1997.619813
  • Filename
    619813