• DocumentCode
    3249821
  • Title
    Page segmentation and classification based on pattern-list analysis
  • Author
    Wang, Jiajun ; Li, Yanling ; Huang, Xianwu ; He, Zhenya
  • Author_Institution
    Sch. of Electron. & Inf. Eng., Soochow Univ., Suzhou, China
  • fYear
    2004
  • fDate
    20-22 Oct. 2004
  • Firstpage
    735
  • Lastpage
    738
  • Abstract
    In this paper, a new algorithm based on pattern-list analysis is proposed for page segmentation and classification. The algorithm has three steps: bounding rectangle location, pattern formation, and pattern classification. Patterns that may have been wrongly classified are then reclassified using their contextual information. Experimental results show the accuracy of the algorithm in segmenting text and non-text regions, especially for document images with irregular-shaped halftone regions. The algorithm is valid only for binary document images.
  • Keywords
    document image processing; image classification; image segmentation; text analysis; binary document images; bounding rectangle location; contextual information; irregular-shaped halftone regions; nontext regions; page segmentation; pattern classification; pattern formation; pattern-list analysis; text segmentation; Algorithm design and analysis; Automation; Humans; Image analysis; Image segmentation; Interference; Pattern analysis; Pattern classification; Pixel; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on
  • Print_ISBN
    0-7803-8687-6
  • Type
    conf
  • DOI
    10.1109/ISIMP.2004.1434169
  • Filename
    1434169