• Title of article

    Using Shape and Layout Information to Find Signatures, Text, and Graphics

  • Author/Authors

    Hobby, John D. (author)

  • Issue Information
    Journal with serial issue number, year 2000
  • Pages
    -87
  • From page
    88
  • To page
    0
  • Abstract
    The decomposition of a page image into text and various types of nontext elements is a challenging problem important in document analysis problems such as optical character recognition, storage and retrieval, and identification of the sender and recipient of a FAX. A fast classifier based on a skeletonization of the image attempts to classify groups of related line segments as text, ruling lines, signatures, other line art, or miscellaneous items. Then everything classified as text is processed by Baird's language-free layout analysis system so that a postprocessor can use the geometric layout to refine decisions about what is text and what is nontext. This could then be further processed to identify complex objects such as tables, signature blocks, and line drawings. In order to recognize signatures and to separate them from ruling lines and components of line drawings, line segments from skeletonization need to be strung together by a curve-fitting process. After long, fairly straight lines are found and set aside, a more lenient criterion strings together pairs of segments to form the groups on which to run the fast classifier.
  • Keywords
    glucose transport, phosphatase inhibitors, differentiation, HD3 cells
  • Journal title
    COMPUTER VISION & IMAGE UNDERSTANDING
  • Serial Year
    2000
  • Record number

    33969