• DocumentCode
    750082
  • Title
    Segmentation of document images
  • Author
    Taxt, T. ; Flynn, P.J. ; Jain, A.K.
  • Author_Institution
    Dept. of Comput. Sci., Michigan State Univ., East Lansing, MI, USA
  • Volume
    11
  • Issue
    12
  • fYear
    1989
  • Firstpage
    1322
  • Lastpage
    1329
  • Abstract
    Several methods for segmentation of document images (maps, drawings, etc.) are explored. The segmentation operation is posed as a statistical classification task with two pattern classes: print and background. A number of classification strategies are available. All require some prior information about the distribution of gray levels for the two classes. Training (either supervised or unsupervised) is employed to form these initial density estimates. Automatic updating of the class-conditional densities is performed within subregions in the image to adapt these global density estimates to the local image area. After local class-conditional densities have been obtained, each pixel is classified within the window using several techniques: a noncontextual Bayes classifier, Besag's classifier, relaxation, Owen and Switzer's classifier, and Haslett's classifier. Four test images were processed. In two of these, the relaxation method performed best, and in the other two, the noncontextual method performed best. Automatic updating improved the results for both classifiers.
  • Keywords
    pattern recognition; picture processing; statistical analysis; Besag's classifier; Haslett's classifier; Owen and Switzer's classifier; background; class-conditional densities; document image segmentation; drawings; gray level distribution; maps; noncontextual Bayes classifier; print; relaxation; statistical classification task; Councils; Data mining; Degradation; Digital images; Fading; Image databases; Image segmentation; Markov random fields; Storage automation; Testing
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/34.41371
  • Filename
    41371
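
The noncontextual Bayes classifier named in the abstract can be illustrated with a minimal sketch. It labels each pixel from its own gray level alone, choosing the class with the larger prior-weighted class-conditional density. The abstract does not state the form of the densities, so the Gaussian model here is an assumption, and the function names, parameter values, and equal priors are hypothetical. This is not the authors' implementation, and it omits the local density updating and the contextual classifiers.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x (assumed density model)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify_pixel(gray, print_params, background_params, prior_print=0.5):
    """Label one gray level as 'print' or 'background', ignoring neighbours.

    print_params and background_params are (mean, std) pairs describing
    the class-conditional gray-level densities.
    """
    mean_p, std_p = print_params
    mean_b, std_b = background_params
    # Bayes rule: compare prior * likelihood for the two classes.
    score_print = prior_print * gaussian_pdf(gray, mean_p, std_p)
    score_background = (1.0 - prior_print) * gaussian_pdf(gray, mean_b, std_b)
    return "print" if score_print > score_background else "background"


def segment(image, print_params, background_params, prior_print=0.5):
    """Classify every pixel of a 2-D list of gray levels independently."""
    return [
        [classify_pixel(g, print_params, background_params, prior_print)
         for g in row]
        for row in image
    ]


# Hypothetical densities: dark print on a light background.
PRINT_PARAMS = (40.0, 20.0)
BACKGROUND_PARAMS = (200.0, 25.0)

labels = segment([[30, 200], [90, 180]], PRINT_PARAMS, BACKGROUND_PARAMS)
```

In an adaptive variant along the lines the abstract describes, the (mean, std) pairs would be re-estimated inside each image subregion before the pixels in that window are classified.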