DocumentCode :
1409490
Title :
Handwritten Chinese Text Recognition by Integrating Multiple Contexts
Author :
Wang, Qiu-Feng ; Yin, Fei ; Liu, Cheng-Lin
Author_Institution :
Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
Volume :
34
Issue :
8
fYear :
2012
Firstpage :
1469
Lastpage :
1481
Abstract :
This paper presents an effective approach for the offline recognition of unconstrained handwritten Chinese texts. Under the general integrated segmentation-and-recognition framework with character oversegmentation, we investigate three important issues: candidate path evaluation, path search, and parameter estimation. For path evaluation, we combine multiple contexts (character recognition scores, geometric and linguistic contexts) from the Bayesian decision view, and convert the classifier outputs to posterior probabilities via confidence transformation. In path search, we use a refined beam search algorithm to improve the search efficiency and, meanwhile, use a candidate character augmentation strategy to improve the recognition accuracy. The combining weights of the path evaluation function are optimized by supervised learning using a Maximum Character Accuracy criterion. We evaluated the recognition performance on a Chinese handwriting database CASIA-HWDB, which contains nearly four million character samples of 7,356 classes and 5,091 pages of unconstrained handwritten texts. The experimental results show that confidence transformation and combining multiple contexts improve the text line recognition performance significantly. On a test set of 1,015 handwritten pages, the proposed approach achieved character-level accurate rate of 90.75 percent and correct rate of 91.39 percent, which are superior by far to the best results reported in the literature.
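The path evaluation and path search steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each candidate character in the oversegmentation lattice already carries log-scores for three contexts (character recognition, geometric, linguistic), that a path is scored by a weighted sum of those log-scores, and that a plain beam search stands in for the refined one. The names `Candidate`, `beam_search`, `demo_candidates`, and `demo_weights`, and all numeric values, are hypothetical. In particular, real linguistic scores depend on the preceding characters, which this sketch ignores.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical lattice edge: (start boundary, end boundary, character,
# log-score per context). Boundaries index the oversegmented primitives.
Candidate = Tuple[int, int, str, Dict[str, float]]


def beam_search(
    candidates: List[Candidate],
    n_segments: int,
    weights: Dict[str, float],
    beam_width: int = 3,
) -> Tuple[float, str]:
    """Return (score, text) of the best path from boundary 0 to n_segments."""
    # Index candidate characters by the boundary where they start.
    by_start: Dict[int, List[Candidate]] = {}
    for cand in candidates:
        by_start.setdefault(cand[0], []).append(cand)

    # beams[b] holds partial paths (score, text) that end at boundary b.
    beams: Dict[int, List[Tuple[float, str]]] = {0: [(0.0, "")]}
    for pos in range(n_segments):
        # Every edge moves forward, so all paths reaching `pos` are known
        # by now; keep only the best `beam_width` before expanding them.
        beams[pos] = sorted(beams.get(pos, []), reverse=True)[:beam_width]
        for score, text in beams[pos]:
            for _, end, char, ctx in by_start.get(pos, []):
                # Weighted combination of the context log-scores.
                step = sum(weights[name] * ctx[name] for name in weights)
                beams.setdefault(end, []).append((score + step, text + char))

    final = beams.get(n_segments, [])
    if not final:
        raise ValueError("no complete path through the lattice")
    return max(final)


def _ctx(char_p: float, geo_p: float, lm_p: float) -> Dict[str, float]:
    """Helper for the demo: turn made-up probabilities into log-scores."""
    return {"char": math.log(char_p), "geo": math.log(geo_p), "lm": math.log(lm_p)}


# Made-up lattice over three primitive segments: the first two segments
# can be read as two characters or merged into one.
demo_candidates: List[Candidate] = [
    (0, 1, "木", _ctx(0.6, 0.5, 0.2)),
    (1, 2, "寸", _ctx(0.6, 0.5, 0.2)),
    (0, 2, "村", _ctx(0.9, 0.9, 0.8)),
    (2, 3, "庄", _ctx(0.9, 0.9, 0.8)),
]
demo_weights = {"char": 1.0, "geo": 1.0, "lm": 1.0}
best_score, best_text = beam_search(demo_candidates, 3, demo_weights)
```

In this toy lattice the merged reading of the first two segments wins because all three of its context scores are higher. In the paper, the combining weights are learned with the Maximum Character Accuracy criterion rather than fixed by hand as they are here.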
Keywords :
Bayes methods; handwritten character recognition; learning (artificial intelligence); natural languages; pattern classification; probability; search problems; text analysis; Bayesian decision; CASIA-HWDB; Chinese handwriting database; beam search algorithm; candidate character augmentation strategy; character oversegmentation; character recognition scores; character-level accurate rate; character-level correct rate; classifier; confidence transformation; geometric contexts; handwritten Chinese text offline recognition; handwritten pages; integrated segmentation-and-recognition framework; linguistic contexts; maximum character accuracy criterion; multiple contexts; parameter estimation; path evaluation function optimization; path search; posterior probabilities; recognition accuracy improvement; recognition performance; search efficiency improvement; supervised learning; text line recognition performance improvement; unconstrained handwritten texts; Character recognition; Context; Handwriting recognition; Hidden Markov models; Image segmentation; Lattices; Text recognition; Handwritten Chinese text recognition; candidate character augmentation; confidence transformation; geometric models; language models; maximum character accuracy training; refined beam search; Algorithms; Bayes Theorem; Databases, Factual; Handwriting; Humans; Image Processing, Computer-Assisted; Pattern Recognition, Automated
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/TPAMI.2011.264
Filename :
6112767