DocumentCode
1486843
Title
Incorporating language syntax in visual text recognition with a statistical model
Author
Hull, Jonathan J.
Author_Institution
Ricoh California Res. Center, Menlo Park, CA, USA
Volume
18
Issue
12
fYear
1996
fDate
12/1/1996 12:00:00 AM
Firstpage
1251
Lastpage
1255
Abstract
The use of a statistical language model to improve the performance of an algorithm for recognizing digital images of handwritten or machine-printed text is discussed. A word recognition algorithm first determines a set of words (called a neighborhood) from a lexicon that are visually similar to each input word image. Syntactic classifications for the words and the transition probabilities between those classifications are input to the Viterbi algorithm. The Viterbi algorithm determines the sequence of syntactic classes (the states of an underlying Markov process) for each sentence that has the maximum a posteriori probability, given the observed neighborhoods. The performance of the word recognition algorithm is improved by removing from each neighborhood the words whose classes are not on the estimated state sequence. An experimental application is demonstrated with a neighborhood generation algorithm that produces a number of guesses about the identity of each word in a running text. The use of zeroth-, first-, and second-order transition probabilities and different levels of noise in estimating the neighborhood are explored.
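The pipeline described in the abstract (neighborhoods in, most probable class sequence out, neighborhoods pruned) can be sketched roughly as follows. This is a minimal illustration with first-order transitions only, not the paper's implementation: the function name `viterbi_filter`, the toy lexicon, the tag set, all probabilities, and in particular the emission score (the share of words in a neighborhood that admit a class) are assumptions made for the example.

```python
import math


def _log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_filter(neighborhoods, word_tags, start_p, trans_p):
    """Return (best class sequence, pruned neighborhoods).

    neighborhoods: list of non-empty lists of candidate words, one per word image
    word_tags:     dict word -> set of syntactic classes (hypothetical lexicon)
    start_p:       dict class -> initial probability
    trans_p:       dict class -> dict class -> first-order transition probability
    """
    tags = sorted(start_p)

    def emit(tag, hood):
        # Assumed emission score: share of candidate words that admit `tag`.
        return sum(tag in word_tags[w] for w in hood) / len(hood)

    # score[t] = best log-probability of any class path ending in t
    score = {t: _log(start_p[t]) + _log(emit(t, neighborhoods[0])) for t in tags}
    back = []  # back[i][t] = best predecessor of class t at position i + 1
    for hood in neighborhoods[1:]:
        prev, score, ptr = score, {}, {}
        for t in tags:
            best = max(tags, key=lambda s: prev[s] + _log(trans_p[s][t]))
            score[t] = prev[best] + _log(trans_p[best][t]) + _log(emit(t, hood))
            ptr[t] = best
        back.append(ptr)

    # Trace the best path backwards from the best final class.
    path = [max(tags, key=lambda t: score[t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()

    # Drop words whose classes are not on the estimated state sequence.
    pruned = [[w for w in hood if t in word_tags[w]]
              for hood, t in zip(neighborhoods, path)]
    return path, pruned


# Toy data (entirely made up for illustration).
word_tags = {"the": {"DET"}, "he": {"PRON"}, "dog": {"NOUN"},
             "dug": {"VERB"}, "barks": {"VERB"}}
start_p = {"DET": 0.5, "PRON": 0.3, "NOUN": 0.1, "VERB": 0.1}
trans_p = {
    "DET":  {"DET": 0.025, "NOUN": 0.9, "PRON": 0.025, "VERB": 0.05},
    "PRON": {"DET": 0.05, "NOUN": 0.1, "PRON": 0.05, "VERB": 0.8},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "PRON": 0.1, "VERB": 0.6},
    "VERB": {"DET": 0.4, "NOUN": 0.3, "PRON": 0.2, "VERB": 0.1},
}
neighborhoods = [["the", "he"], ["dog", "dug"], ["barks"]]
path, pruned = viterbi_filter(neighborhoods, word_tags, start_p, trans_p)
```

With this toy data the visually confusable candidates "he" and "dug" are removed because their classes fall off the most probable class sequence. A second-order variant would track pairs of classes as states; a zeroth-order variant would ignore `trans_p` and pick the best class per position.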
Keywords
Markov processes; Viterbi detection; grammars; image classification; image recognition; natural languages; optical character recognition; statistical analysis; Markov process; Viterbi algorithm; digital image recognition; handwritten text; language syntax; lexicon; machine-printed text; maximum a posteriori probability; neighborhood; statistical language model; syntactic class sequence; visual text recognition; word recognition algorithm; Character recognition; Computational modeling; Degradation; Hidden Markov models; Image analysis; Image recognition; Speech recognition; Text analysis; Text recognition; Viterbi algorithm
fLanguage
English
Journal_Title
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher
IEEE
ISSN
0162-8828
Type
jour
DOI
10.1109/34.546261
Filename
546261