Title :
General word recognition using approximate segment-string matching
Author_Institution :
CEDAR, State Univ. of New York, Buffalo, NY, USA
Abstract :
This paper focuses on the problem of isolated off-line general word recognition using an approximate stroke-segment/string matching algorithm. Several recently proposed word recognition algorithms use the strategy of directly matching the stroke segments (with OCR estimates) to the sequence of characters in each lexicon word. This idea works very well under ideal conditions; however, many applications require the recognition of text in the presence of document noise, poor handwriting, and lexicon errors. These factors require careful design of the matching strategy so that a moderate amount of any form of degradation does not cause a recognition failure. A segment-to-string matching algorithm is proposed that robustly recovers from moderate levels of noise and system errors. This algorithm is developed in the context of a complete word recognition system and serves as its final post-processing module.
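The general idea of matching stroke segments (with OCR estimates) against the characters of each lexicon word can be sketched as a small dynamic program. This is an illustration only and not the algorithm proposed in the paper: the names `match_word`, `rank_lexicon`, and `span_scores`, the additive scoring, and the penalty values for skipping a noisy segment or a lexicon character are all assumptions made for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: for a span of consecutive segments [start, end),
# a mapping from candidate character to an OCR confidence in [0, 1].
SpanScores = Dict[Tuple[int, int], Dict[str, float]]


def match_word(word: str, n_segments: int, span_scores: SpanScores,
               max_span: int = 3, skip_segment: float = -0.5,
               skip_char: float = -0.7) -> float:
    """Best additive score for aligning `word` to segments 0..n_segments-1.

    Each character may absorb 1..max_span consecutive segments. A segment
    may be skipped (treated as noise) and a character may be skipped
    (treated as a lexicon error); both penalties are assumed values.
    """
    neg_inf = float("-inf")
    m = len(word)
    # best[k][i]: best score using the first k characters and first i segments
    best = [[neg_inf] * (n_segments + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(m + 1):
        for i in range(n_segments + 1):
            cur = best[k][i]
            if cur == neg_inf:
                continue
            if i < n_segments:  # skip one segment as noise
                best[k][i + 1] = max(best[k][i + 1], cur + skip_segment)
            if k < m:
                # skip one character of the lexicon word
                best[k + 1][i] = max(best[k + 1][i], cur + skip_char)
                # match character k to the span of segments [i, j)
                for j in range(i + 1, min(n_segments, i + max_span) + 1):
                    conf = span_scores.get((i, j), {}).get(word[k], 0.0)
                    best[k + 1][j] = max(best[k + 1][j], cur + conf)
    return best[m][n_segments]


def rank_lexicon(lexicon: List[str], n_segments: int,
                 span_scores: SpanScores) -> List[Tuple[str, float]]:
    """Score every lexicon word and return them best-first."""
    scored = [(w, match_word(w, n_segments, span_scores)) for w in lexicon]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example: four segments, where segments 1 and 2 together look like 'a'.
example_scores: SpanScores = {
    (0, 1): {"c": 0.9},
    (1, 2): {"o": 0.3},
    (1, 3): {"a": 0.8},
    (3, 4): {"t": 0.9},
}
ranking = rank_lexicon(["cot", "cat"], 4, example_scores)
```

Because skipped segments and skipped characters are penalized rather than forbidden, a single noisy segment or a misspelled lexicon entry lowers a word's score without ruling it out, which is the kind of graceful degradation the abstract calls for.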
Keywords :
document image processing; handwriting recognition; image matching; optical character recognition; string matching; beam search; CMWR algorithm; OCR estimates; approximate stroke-segment/string matching algorithm; character model word recognizer; character sequence; cursive script recognition; degraded document recognition; document noise; image degradation; isolated off-line general word recognition; lexicon errors; lexicon words; poor handwriting; post-processing module; recognition failure; segment-to-string matching algorithm; system errors; text recognition; Character recognition; Degradation; Handwriting recognition; Image recognition; Image segmentation; Noise level; Noise robustness; Optical character recognition software; Text recognition; World Wide Web
Conference_Title :
Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997
Conference_Location :
Ulm
Print_ISBN :
0-8186-7898-4
DOI :
10.1109/ICDAR.1997.619820