• DocumentCode
    2056487
  • Title

    Page segmentation using document model

  • Author

    Jain, Anil K. ; Yu, Bin

  • Author_Institution
    Dept. of Comput. Sci., Michigan State Univ., East Lansing, MI, USA
  • Volume
    1
  • fYear
    1997
  • fDate
    18-20 Aug 1997
  • Firstpage
    34
  • Abstract
    Transforming a paper document to its electronic version in a form suitable for efficient storage, retrieval and interpretation continues to be a challenging problem. An efficient document model is necessary to solve this problem. Document modeling involves techniques of thresholding, skew detection, geometric layout analysis and logical layout analysis. The derived model can then be used in document storage and retrieval. We use the traditional bottom-up approach based on connected component extraction to efficiently implement page segmentation and region identification. We propose a new document model that preserves top-down generation information; based on this model, a document is logically represented for interactive editing, storage, retrieval, transfer and logical analysis.
  • Keywords
    document image processing; feature extraction; image segmentation; information retrieval; connected component extraction; document model; document modeling; document retrieval; document storage; geometric layout analysis; interactive editing; logical analysis; logical layout analysis; page segmentation; region identification; skew detection; thresholding; top-down generation information; Computer science; Data mining; Feature extraction; Image analysis; Image retrieval; Image segmentation; Image storage; Information analysis; Information retrieval; Solid modeling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
  • Conference_Location
    Ulm
  • Print_ISBN
    0-8186-7898-4
  • Type

    conf

  • DOI
    10.1109/ICDAR.1997.619809
  • Filename
    619809