Title :
Text recognition enhancement with a probabilistic lattice chart parser
Author :
Hong, Tao ; Hull, Jonathan J.
Author_Institution :
CEDAR, State Univ. of New York, Buffalo, NY, USA
Abstract :
A probabilistic lattice chart parser is proposed for improving the performance of a text recognition technique. Digital images of words are recognized and alternatives for the identity of each are generated. Local word collocation statistics and a probabilistic chart parsing algorithm are used to determine the top N parses for each sentence using the alternatives provided for the identity of each word by the recognition system. An approach in which text recognition and understanding are tightly integrated is discussed. An objective of this approach is to provide the capacity to process images of unrestricted English text. A large-scale lexicon, which supports the system, was acquired by training on corpora of over 3,000,000 words. The focus is on the implementation and performance of the probabilistic lattice chart parser.
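The idea of choosing among word alternatives with collocation statistics can be illustrated with a minimal sketch. This is not the authors' parser: it uses no grammar or chart, and it substitutes a plain N-best search over a word lattice scored by recognizer confidence and pairwise collocation strength. The function name `top_n_sequences`, the `collocation` table, and the `floor` value for unseen word pairs are all assumptions made for illustration.

```python
import heapq
import math


def top_n_sequences(lattice, collocation, n=3, floor=1e-6):
    """Return the n best word sequences through a word lattice.

    lattice: one list per word position, each holding (word, recognizer_prob)
             alternatives (hypothetical recognizer output).
    collocation: dict mapping (previous_word, word) to a strength in (0, 1]
                 (hypothetical statistics; unseen pairs get `floor`).
    Returns a list of (log_score, words) sorted from best to worst.
    """
    hypotheses = [(0.0, [])]  # (log score, words chosen so far)
    for alternatives in lattice:
        extended = []
        for score, words in hypotheses:
            for word, prob in alternatives:
                step = math.log(prob)
                if words:
                    pair = (words[-1], word)
                    step += math.log(collocation.get(pair, floor))
                extended.append((score + step, words + [word]))
        # Scores depend only on the previous word, so keeping the n best
        # hypotheses per final word is enough to recover the overall n best.
        by_last = {}
        for hyp in extended:
            by_last.setdefault(hyp[1][-1], []).append(hyp)
        hypotheses = [
            hyp
            for group in by_last.values()
            for hyp in heapq.nlargest(n, group, key=lambda h: h[0])
        ]
    return heapq.nlargest(n, hypotheses, key=lambda h: h[0])
```

A call such as `top_n_sequences(lattice, collocation, n=2)` returns the two highest-scoring sequences. In a fuller system along the lines the abstract describes, these candidates would be passed to a probabilistic parser rather than ranked by collocation alone.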
Keywords :
image recognition; natural languages; optical character recognition; probability; statistics; corpora; digital word images; large-scale lexicon; performance; probabilistic lattice chart parser; sentence analysis; text recognition enhancement; text understanding; training; unrestricted English text; word alternatives; word collocation statistics; Degradation; Digital images; Image recognition; Large-scale systems; Lattices; Natural languages; Performance analysis; Speech recognition; Text analysis; Text recognition
Conference_Title :
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
Conference_Location :
Tsukuba Science City
Print_ISBN :
0-8186-4960-7
DOI :
10.1109/ICDAR.1993.395744