• DocumentCode
    3489826
  • Title
    Exploring MPE/MWE Training for Chinese Handwriting Recognition
  • Author
    Tonghua Su ; Peijun Ma ; Tong Wei ; Shu Liu ; Shengchun Deng
  • Author_Institution
    Sch. of Software, Harbin Inst. of Technol., Harbin, China
  • fYear
    2013
  • fDate
    25-28 Aug. 2013
  • Firstpage
    1275
  • Lastpage
    1279
  • Abstract
    The HMM-based segmentation-free strategy for Chinese handwriting recognition has the merit that the model parameters can be trained with text line samples without annotation of character boundaries. However, the recognition performance has been limited by the general maximum likelihood estimation framework. In this paper, we investigate the discriminative training framework based on MPE/MWE criteria in the context of Chinese handwriting recognition for the first time. It optimizes an objective function that is a smooth measure of recognition error. The EBW procedure is then used to optimize these criteria. Some key issues for robust MPE/MWE training are explored. We show that MPE/MWE training requires more training samples, whereas Chinese handwriting recognition poses a severe data sparsity problem. We explore sample synthesis to help the training process. Experiments on a Chinese handwriting database demonstrate the effectiveness of MPE/MWE training. In particular, a reduction in recognition error of at least 28% is observed in MPE/MWE training with 50 copies of synthetic samples when a bigram is used to approximate the language model.
  • Keywords
    handwriting recognition; hidden Markov models; maximum likelihood estimation; natural language processing; text analysis; visual databases; Chinese handwriting database; Chinese handwriting recognition; EBW procedure; HMM-based segmentation-free strategy; MPE criteria; MWE criteria; bigram; character boundary annotation; data sparsity problem; discriminative training framework; maximum likelihood estimation framework; model parameter; recognition error; recognition performance; robust MPE training; robust MWE training; text line samples; Handwriting recognition; Hidden Markov models; Lattices; Linear programming; Maximum likelihood estimation; Speech recognition; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2013.258
  • Filename
    6628819