Regional Information Center for Science and Technology - Exploring MPE/MWE Training for Chinese Handwriting Recognition

DocumentCode :

3489826

Title :

Exploring MPE/MWE Training for Chinese Handwriting Recognition

Author :

Tonghua Su ; Peijun Ma ; Tong Wei ; Shu Liu ; Shengchun Deng

Author_Institution :

Sch. of Software, Harbin Inst. of Technol., Harbin, China

fYear :

2013

fDate :

25-28 Aug. 2013

Firstpage :

1275

Lastpage :

1279

Abstract :

The HMM-based segmentation-free strategy for Chinese handwriting recognition has the merit that the model parameters can be trained with text line samples without annotation of character boundaries. However, recognition performance has been limited by the general maximum likelihood estimation framework. In this paper, we investigate, for the first time, a discriminative training framework based on the MPE/MWE criteria in the context of Chinese handwriting recognition. It optimizes an objective function that is a smooth measure of recognition error, and the EBW procedure is used to optimize these criteria. Some key issues for robust MPE/MWE training are explored. We show that MPE/MWE requires more training samples, whereas Chinese handwriting recognition poses a severe data sparsity problem, so we explore sample synthesis to help the training process. Experiments are conducted on a Chinese handwriting database and demonstrate the effectiveness of MPE/MWE training. In particular, an error reduction of at least 28% is observed with MPE/MWE training using 50 copies of synthetic samples when a bigram is used to approximate the language model.
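
Illustrative note: the "smooth measure of recognition error" in the abstract can be read as an expected accuracy, where each competing hypothesis for a text line is weighted by its posterior probability. The sketch below is a simplified illustration of that idea and not the authors' implementation. It assumes a short list of competing hypotheses rather than a lattice, and the function name `mpe_objective`, the `(log_score, accuracy)` pair layout, and the scaling factor `kappa` are all hypothetical choices made for this example.

```python
import math


def mpe_objective(hypotheses, kappa=1.0):
    """Expected accuracy of one text line over its competing hypotheses.

    hypotheses: list of (log_score, accuracy) pairs. log_score is the combined
        model and language-model log score of a hypothesis; accuracy is the
        number of correctly recognized units compared with the reference.
    kappa: scaling factor applied to the log scores (an assumption here).
    """
    scaled = [kappa * log_score for log_score, _ in hypotheses]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    # Posterior-weighted average of the per-hypothesis accuracies.
    return sum(w / total * acc for w, (_, acc) in zip(weights, hypotheses))


# Two equally likely hypotheses with 4 and 2 correct units.
print(mpe_objective([(-10.0, 4), (-10.0, 2)]))
```

Because the value is a differentiable function of the hypothesis scores, it can be raised by adjusting model parameters, which is what makes it usable as a training criterion.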

Keywords :

handwriting recognition; hidden Markov models; maximum likelihood estimation; natural language processing; text analysis; visual databases; Chinese handwriting database; Chinese handwriting recognition; EBW procedure; HMM-based segmentation-free strategy; MPE criteria; MWE criteria; bigram; character boundary annotation; data sparsity problem; discriminative training framework; maximum likelihood estimation framework; model parameter; recognition error; recognition performance; robust MPE training; robust MWE training; text line samples; Handwriting recognition; Hidden Markov models; Lattices; Linear programming; Maximum likelihood estimation; Speech recognition; Training;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Document Analysis and Recognition (ICDAR), 2013 12th International Conference on

Conference_Location :

Washington, DC

ISSN :

1520-5363

Type :

conf

DOI :

10.1109/ICDAR.2013.258

Filename :

6628819

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3489826