• DocumentCode
    2079367
  • Title

    Document image understanding: geometric and logical layout

  • Author

    Haralick, Robert M.

  • Author_Institution
    Dept. of Electr. Eng., Washington Univ., Seattle, WA, USA
  • fYear
    1994
  • fDate
    21-23 Jun 1994
  • Firstpage
    385
  • Lastpage
    390
  • Abstract
    Document image understanding encompasses the technology required to make paper documents equivalent to other computer exchange media such as floppies, tapes, and CDROMs. The physical reader of the paper document is the scanner, just as the physical reader of the floppy is the floppy drive, the physical reader of the tape cartridge is the tape cartridge drive, and the physical reader of the CDROM is the CDROM drive. In the survey presented, we restrict ourselves to documents such as business letters, forms, and scientific and technical articles such as those found in archival journals and technical conferences. Understanding such documents involves estimating the rotation skew of each document page, determining the geometric page layout, labeling blocks as text or non-text, determining the read order for text blocks, recognizing the text of text blocks through an OCR system, determining the logical page layout, and formatting the data and information of the document in a suitable way for use by a word processing system or by an information retrieval system.
  • Keywords
    character recognition; computational geometry; document handling; document image processing; image processing; word processing; CDROMs; OCR system; business letters; computer exchange media; data formatting; document image understanding; document page; floppies; geometric page layout; information retrieval system; logical layout; logical page layout; non-text; read order; rotation skew; scanner; technical articles; text blocks; word processing system; text processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition, 1994. Proceedings CVPR '94., 1994 IEEE Computer Society Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1063-6919
  • Print_ISBN
    0-8186-5825-8
  • Type

    conf

  • DOI
    10.1109/CVPR.1994.323855
  • Filename
    323855