• DocumentCode
    1147287
  • Title

    Document Ink Bleed-Through Removal with Two Hidden Markov Random Fields and a Single Observation Field

  • Author

    Wolf, Christian

  • Author_Institution
    INSA-Lyon, Univ. de Lyon, Villeurbanne, France
  • Volume
    32
  • Issue
    3
  • fYear
    2010
  • fDate
    3/1/2010 12:00:00 AM
  • Firstpage
    431
  • Lastpage
    447
  • Abstract
    We present a new method for blind document bleed-through removal based on separate Markov random field (MRF) regularization for the recto and for the verso side, where separate priors are derived from the full graph. The segmentation algorithm is based on Bayesian maximum a posteriori (MAP) estimation. The advantages of this separate approach are the adaptation of the prior to the content creation process (e.g., superimposing two handwritten pages), and the improvement of the estimation of the recto pixels through an estimation of the verso pixels covered by recto pixels; moreover, the formulation as a binary labeling problem with two hidden labels per pixel naturally leads to an efficient optimization method based on the minimum cut/maximum flow in a graph. The proposed method is evaluated on scanned document images from the 18th century, showing an improvement of character recognition results compared to other restoration methods.
  • Keywords
    Bayes methods; character recognition; document image processing; graph theory; hidden Markov models; image restoration; image segmentation; maximum likelihood estimation; optimisation; Bayesian maximum a posteriori estimation; Markov random field regularization; binary labeling problem; blind document bleed-through removal; document image restoration; document ink bleed-through removal; graph; hidden Markov random fields; maximum flow; minimum cut; optimization method; recto pixel estimation; segmentation algorithm; single observation field; Bayesian estimation; Markov random fields; graph cuts
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.2009.33
  • Filename
    4775903