On the use of duration-corrected N-best hypotheses for text recognition in gray-scale document images

Author

Yen, Chinching; Kuo, Shyh Shiaw

Author_Institution

AT&T Bell Labs., Somerset, NJ, USA

Volume

3

fYear

1995

fDate

23-26 Oct 1995

Firstpage

332

Abstract

The pseudo two-dimensional hidden Markov model (PHMM) is extended to directly recognize poorly printed gray-scale document images. An N-best hypotheses search, coupled with duration correction, is also developed to find the best candidates. Experimental results show that the new system performs significantly better than the original PHMM system [Kuo and Agazzi, 1994], which uses binary images as input. The recognition rate improves from 97.7% to 100% on a test set whose blur and noise conditions are similar to those of the training set. For a test set whose conditions lie far outside the training range, it improves from 89.59% to 98.51%, which also demonstrates the robustness of the proposed system.

Keywords

document image processing; hidden Markov models; image recognition; optical character recognition; duration-corrected N-best hypotheses; gray-scale document images; performance; pseudo two-dimensional hidden Markov model; recognition rate; robustness; text recognition; Automatic speech recognition; Gray-scale; Hidden Markov models; Image recognition; Kernel; Noise robustness; Optical character recognition software; System testing; Text recognition; Viterbi algorithm

fLanguage

English

Publisher

IEEE

Conference_Titel

Image Processing, 1995. Proceedings., International Conference on

Conference_Location

Washington, DC

Print_ISBN

0-8186-7310-9

Type

conf

DOI

10.1109/ICIP.1995.537642

Filename

537642