Title :
On the use of duration-corrected N-best hypotheses for text recognition in gray-scale document images
Author :
Yen, Chinching ; Kuo, Shyh Shiaw
Author_Institution :
AT&T Bell Labs., Somerset, NJ, USA
Abstract :
The pseudo two-dimensional hidden Markov model (PHMM) is extended to directly recognize poorly printed gray-scale document images. An N-best hypotheses search, coupled with duration correction, is also developed to find the best candidates. Experimental results show that the new system performs significantly better than the original PHMM system [Kuo and Agazzi, 1994], which uses binary images as inputs. The recognition rate improves from 97.7% to 100% on a testing set whose blur and noise conditions are similar to those of the training set. For a testing range far outside the training range, it improves from 89.59% to 98.51%, which also demonstrates the robustness of the proposed system.
Keywords :
document image processing; hidden Markov models; image recognition; optical character recognition; duration-corrected N-best hypotheses; gray-scale document images; performance; pseudo two dimensional hidden Markov model; recognition rate; robustness; text recognition; Automatic speech recognition; Gray-scale; Hidden Markov models; Image recognition; Kernel; Noise robustness; Optical character recognition software; System testing; Text recognition; Viterbi algorithm
Conference_Title :
Image Processing, 1995. Proceedings., International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
0-8186-7310-9
DOI :
10.1109/ICIP.1995.537642