Title :
Interpreting word recognition decisions with a document database graph
Author :
Hull, Jonathan J. ; Li, Yanhong
Author_Institution :
Dept. of Comput. Sci., State Univ. of New York, Buffalo, NY, USA
Abstract :
A method is presented for filtering the output of a word recognition algorithm, which may contain errors, to locate decisions that are correct with a high degree of certainty. The method combines the output of a word recognition system with information retrieval techniques that characterize a free-text document database, in order to locate a set of documents whose topics are similar to that of the input document. The vocabulary of these similar documents is then used to identify the correct word recognition decisions. Experimental results show that a subset of the word recognition decisions for an input document can be located that is between 90% and 99% correct. This subset can be used to drive other recognition processes applied to the rest of the text.
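The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or how the document database graph is built, so cosine similarity over raw term counts, the top-k cutoff, the function names `cosine` and `filter_decisions`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def filter_decisions(recognized, database, k=2):
    """Keep only recognition decisions found in the vocabulary of the
    k database documents most similar to the recognized text.

    recognized: list of top-choice words from a (hypothetical) recognizer.
    database:   list of documents, each a list of words.
    Returns a list of (position, word) pairs judged likely correct.
    """
    query = Counter(recognized)
    # Rank database documents by similarity to the noisy input document.
    ranked = sorted(database, key=lambda doc: cosine(query, Counter(doc)), reverse=True)
    # Pool the vocabulary of the k most similar documents.
    vocabulary = set()
    for doc in ranked[:k]:
        vocabulary.update(doc)
    # Accept a decision only if its word occurs in that vocabulary.
    return [(i, w) for i, w in enumerate(recognized) if w in vocabulary]


# Toy data (invented for illustration only).
database = [
    ["stock", "market", "prices", "fell", "sharply"],
    ["market", "traders", "sold", "stock"],
    ["the", "team", "won", "the", "match"],
]
# "rnarket" simulates a recognition error; "team" is off-topic.
recognized = ["stock", "rnarket", "prices", "fell", "traders", "team"]
accepted = filter_decisions(recognized, database, k=2)
print(accepted)
```

In this toy run the misrecognized word and the off-topic word are both rejected, while the remaining decisions are kept as the high-confidence subset. A real system would need a principled choice of k and of the similarity measure, neither of which is given in the abstract.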
Keywords :
database management systems; document handling; optical character recognition; word processing; document database graph; free-text document database; information retrieval; input document; vocabulary; word recognition algorithm; word recognition decisions; Character recognition; Dictionaries; Filters; Image databases; Image recognition; Information retrieval; Text analysis; Text recognition; Visual databases; Vocabulary
Conference_Title :
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
Conference_Location :
Tsukuba Science City
Print_ISBN :
0-8186-4960-7
DOI :
10.1109/ICDAR.1993.395689