• DocumentCode
    1581384
  • Title

    Document image decoding using Iterated Complete Path search with subsampled heuristic scoring

  • Author

    Bloomberg, Dan S. ; Minka, Thomas P. ; Popat, Kris

  • Author_Institution
    Xerox Palo Alto Res. Center, CA, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    344
  • Lastpage
    349
  • Abstract
    It has been shown that the computation time of document image decoding can be significantly reduced by employing heuristics in the search for the best decoding of a text line. In the Iterated Complete Path (ICP) method, template matches are performed only along the best path found by dynamic programming on each iteration. When the best path stabilizes, the decoding is optimal and no more template matches need to be performed. In this way, only a tiny fraction of the potential template matches must be evaluated, and the computation time is typically dominated by the evaluation of the initial heuristic upper bound for each template at each location in the image. The time to compute this bound depends on the resolution at which the matching scores are found. At lower resolution, the heuristic computation is reduced, but because a weaker bound is used, the number of Viterbi iterations is increased. We present the optimal (lowest upper-bound) heuristic for any degree of subsampling of multilevel templates and/or interpolation, for use in text line decoding with ICP. The optimal degree of subsampling depends on image quality, but it is typically found that a small amount of template subsampling is effective in reducing the overall decoding time.
  • Keywords
    document image processing; dynamic programming; heuristic programming; image coding; image matching; maximum likelihood estimation; search problems; text analysis; ICP method; Iterated Complete Path search; Viterbi iterations; computation time; document image decoding; dynamic programming; heuristic computation; initial heuristic upper-bound; interpolation; matching scores; multilevel template; optimal heuristic; subsampled heuristic scoring; subsampling; template matches; template subsampling; text line decoding; Automata; Character generation; Dynamic programming; Image quality; Interpolation; Iterative closest point algorithm; Iterative decoding; Noise generators; Printing; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
  • Conference_Location
    Seattle, WA
  • Print_ISBN
    0-7695-1263-1
  • Type

    conf

  • DOI
    10.1109/ICDAR.2001.953811
  • Filename
    953811
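
The iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`best_path`, `icp_decode`), the one-dimensional text-line model, and the toy score functions are all assumptions made for illustration. The sketch assumes that every template score starts as a heuristic upper bound, that dynamic programming finds the best complete path under the current scores, and that only edges on that path are re-scored exactly. Once the best path consists solely of exactly scored edges, no other path can beat it, because every remaining score is still an upper bound.

```python
from typing import Callable, Dict, List, Tuple

Edge = Tuple[int, int]  # (template index, left x position); hypothetical layout


def best_path(width: int, widths: List[int],
              score: Dict[Edge, float]) -> Tuple[List[Edge], float]:
    """Dynamic programming: best-scoring template sequence tiling [0, width]."""
    neg_inf = float("-inf")
    best = [neg_inf] * (width + 1)
    back: List[object] = [None] * (width + 1)
    best[0] = 0.0
    for x in range(width + 1):
        if best[x] == neg_inf:
            continue  # position x is not reachable from the left edge
        for t, w in enumerate(widths):
            end = x + w
            if end <= width and best[x] + score[(t, x)] > best[end]:
                best[end] = best[x] + score[(t, x)]
                back[end] = (t, x)
    # Trace back from the right edge to recover the path.
    path: List[Edge] = []
    x = width
    while x > 0:
        edge = back[x]
        if edge is None:
            raise ValueError("no complete path covers the line")
        path.append(edge)
        x = edge[1]
    path.reverse()
    return path, best[width]


def icp_decode(width: int, widths: List[int],
               upper_bound: Callable[[int, int], float],
               exact: Callable[[int, int], float]
               ) -> Tuple[List[Edge], float, int]:
    """Iterate until the best path uses only exactly scored edges.

    Assumes upper_bound(t, x) >= exact(t, x) for every edge.
    Returns (path, total score, number of exact evaluations performed).
    """
    score = {(t, x): upper_bound(t, x)
             for t, w in enumerate(widths)
             for x in range(width - w + 1)}
    exact_edges = set()
    while True:
        path, total = best_path(width, widths, score)
        fresh = [e for e in path if e not in exact_edges]
        if not fresh:
            return path, total, len(exact_edges)
        for e in fresh:  # exact (expensive) matches only along the best path
            score[e] = exact(*e)
            exact_edges.add(e)


# Toy scores, purely illustrative: the bound exceeds the exact score by 0..2.
def toy_exact(t: int, x: int) -> float:
    return float(-((3 * t + 5 * x) % 7))


def toy_upper_bound(t: int, x: int) -> float:
    return toy_exact(t, x) + float((t + x) % 3)


if __name__ == "__main__":
    print(icp_decode(6, [2, 3], toy_upper_bound, toy_exact))
```

In this sketch a tighter `upper_bound` generally means fewer iterations, while a looser (cheaper) one means more. That is the trade-off the abstract attributes to the degree of subsampling.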