• DocumentCode
    3130336
  • Title
    Page segmentation and content classification for automatic document image processing
  • Author
    Yip, S.K.; Chi, Z.
  • Author_Institution
    Center for Multimedia Image Process., Hong Kong Polytech. Univ., China
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    279
  • Lastpage
    282
  • Abstract
    Page segmentation and image content classification are important steps in automatic document image processing, including mixed-type document image compression, form and check reading, and mail sorting. The authors first propose an enhanced background-thinning-based page segmentation approach. They then present a hierarchical approach for classifying the segmented sub-images into one of two categories: text and picture. The approach combines a cross-correlation method, the Kolmogorov complexity measure (A.N. Kolmogorov, 1965), and a neural network classifier in order to achieve both efficiency and high accuracy. The approach has been tested on a number of mixed-type document images with good results.
  • Keywords
    computational complexity; data compression; document image processing; image classification; image coding; image thinning; neural nets; Kolmogorov complexity measure; automatic document image processing; check reading; content classification; cross-correlation method; enhanced background thinning based page segmentation approach; hierarchical approach; image content classification; mail sorting; mixed-type document image compression; neural network classifier; page segmentation; segmented sub-image classification; Character recognition; Correlation; Document image processing; Image analysis; Image coding; Image processing; Image segmentation; Neural networks; Postal services; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Multimedia, Video and Speech Processing, 2001. Proceedings of 2001 International Symposium on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    962-85766-2-3
  • Type
    conf
  • DOI
    10.1109/ISIMP.2001.925388
  • Filename
    925388