Title :
Lexicon-driven word recognition
Author :
Chen, Chien-Huei
Author_Institution :
Inf., Telecommun., & Autom. Div., SRI Int., Menlo Park, CA, USA
Abstract :
Most conventional document understanding systems use lexicons only in a postprocessing step to verify or correct character recognition results. The authors present a new approach to word recognition that uses a lexicon to “drive” the recognition process. Lexicon words are encoded in trie data structures, and a word image is recognized by searching a lexicon trie for the path whose node characters best match the word image. This approach has two important advantages. First, it is segmentation-free: there is no need to presegment the text image into isolated characters. Second, it performs recognition by verifying character hypotheses, as opposed to the classification method used in most conventional optical character recognition (OCR) systems. Hence, the recognition process is more efficient and the results are more accurate. The authors demonstrate the feasibility and advantages of this approach on severely degraded images with a lexicon of more than 50,000 words.
Keywords :
data structures; document image processing; image matching; image recognition; best match; character hypothesis verification; document understanding systems; lexicon trie searching; lexicon-driven word recognition; node characters; segmentation-free method; severely degraded images; trie data structures; word image recognition; Automation; Character generation; Character recognition; Data structures; Degradation; Image recognition; Image resolution; Image segmentation; Intelligent systems; Optical character recognition software
Conference_Title :
Proceedings of the Third International Conference on Document Analysis and Recognition, 1995
Conference_Location :
Montreal, Que.
Print_ISBN :
0-8186-7128-9
DOI :
10.1109/ICDAR.1995.602051