• DocumentCode
    2748547
  • Title

    Language model of Chinese character recognition and its application

  • Author

    Zhang, Sheng ; Wu, Xianli

  • Author_Institution
    Inst. of Autom., Acad. Sinica, Beijing, China
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1507
  • Abstract
    This paper presents a 5-gram combined model that reflects features of the Chinese language and of Chinese character recognition, based on several kinds of Markov language models. The major feature of this model is that it captures both the forward and backward statistical characteristics of a word. The model contains three traditional “trigram components”, a “cache component” that reflects short-term patterns of word use, and a “3g-gram component” based on a new classification method that is fast and automatic. An experiment on a 1,500,000-word corpus shows a significant improvement achieved by the proposed model.
  • Keywords
    character recognition; statistical analysis; 5-gram combined model; Chinese character recognition; Markov language models; backward statistical characters; cache component; forward statistical characters; language model; trigram components; Character recognition; Error correction; Handwriting recognition; History; Ink; Natural languages; Probability; Random processes; Speech processing; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Proceedings, 2000. WCCC-ICSP 2000. 5th International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7803-5747-7
  • Type

    conf

  • DOI
    10.1109/ICOSP.2000.893386
  • Filename
    893386