  • DocumentCode
    2016069
  • Title
    A Shared Parts Model for Document Image Recognition
  • Author
    Gupta, Mithun D.; Sarkar, Prateek
  • Author_Institution
    Univ. of Illinois, Urbana
  • Volume
    2
  • fYear
    2007
  • fDate
    23-26 Sept. 2007
  • Firstpage
    1163
  • Lastpage
    1172
  • Abstract
    We address document image classification by visual appearance. An image is represented by a variable-length list of visually salient features. A hierarchical Bayesian network is used to model the joint density of these features. This model promotes generalization from a few samples by sharing component probability distributions among different categories, and by factoring out a common displacement vector shared by all features within an image. The Bayesian network is implemented as a factor graph, and parameter estimation and inference are both done by loopy belief propagation. We explain and illustrate our model on a simple shape classification task. We obtain close to 90% accuracy in distinguishing journal articles from memos in the UWASH-II dataset, as well as on other classification tasks on a home-grown dataset of technical articles.
  • Keywords
    belief networks; document image processing; image classification; parameter estimation; statistical distributions; Bayesian network; component probability distributions; document image classification; document image recognition; loopy belief propagation; parameter estimation; parameter inference; shared parts model; Bayesian methods; Belief propagation; Image classification; Image recognition; Indexing; Information retrieval; Optical character recognition software; Parameter estimation; Probability distribution; Shape
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
  • Conference_Location
    Parana
  • ISSN
    1520-5363
  • Print_ISBN
    978-0-7695-2822-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.2007.4377098
  • Filename
    4377098