• DocumentCode
    2011060
  • Title

    Improving Handwritten Chinese Text Recognition by Unsupervised Language Model Adaptation

  • Author

    Wang, Qiu-Feng ; Yin, Fei ; Liu, Cheng-Lin

  • Author_Institution
    Nat. Lab. of Pattern Recognition (NLPR), Inst. of Autom., Beijing, China
  • fYear
    2012
  • fDate
    27-29 March 2012
  • Firstpage
    110
  • Lastpage
    114
  • Abstract
    This paper investigates the effects of unsupervised language model adaptation (LMA) in handwritten Chinese text recognition. Because no prior information about the text to be recognized is available, we use a two-pass recognition strategy. In the first pass, a generic language model (LM) produces a preliminary result, which is used to select the best-matched LMs from a set of pre-defined domains; the matched LMs are then used in the second recognition pass. Each LM is compressed to a moderate size via entropy-based pruning, tree-structure formatting, and fewer-byte quantization. We evaluated LMA for five LM types, including both character-level and word-level ones. Experiments on the CASIA-HWDB database show that language model adaptation improves performance for each LM type in all domains. Documents in the ancient domain gained the most, with the character-level correct rate improving by 5.87 percent and the accurate rate by 6.05 percent.
  • Keywords
    handwritten character recognition; natural language processing; tree data structures; CASIA-HWDB database; character-level; entropy-based pruning; fewer-byte quantization; handwritten Chinese text recognition; tree-structure formatting; two-pass recognition strategy; unsupervised language model adaptation; word-level; Adaptation models; Character recognition; Context; Context modeling; Handwriting recognition; Text recognition; Handwritten Chinese text recognition; Language model adaptation; Language model compression; Two-pass recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
  • Conference_Location
    Gold Coast, QLD
  • Print_ISBN
    978-1-4673-0868-7
  • Type

    conf

  • DOI
    10.1109/DAS.2012.46
  • Filename
    6195345