DocumentCode
2748547
Title
Language model of Chinese character recognition and its application
Author
Zhang, Sheng ; Wu, Xianli
Author_Institution
Inst. of Autom., Acad. Sinica, Beijing, China
Volume
3
fYear
2000
fDate
2000
Firstpage
1507
Abstract
Building on several kinds of Markov language models, this paper presents a 5-gram combined model that reflects features of both the Chinese language and Chinese character recognition. The major feature of this model is that it captures both the forward and backward statistical characteristics of a word. The model contains three traditional “trigram components”, a “cache component” which reflects short-term patterns of word use, and a “3g-gram component” based on a new classification method that is fast and automatic. An experiment on a 1,500,000-word corpus shows a significant improvement achieved by the proposed model.
Keywords
character recognition; statistical analysis; 5-gram combined model; Chinese character recognition; Markov language models; backward statistical characters; cache component; forward statistical characters; language model; trigram components; Character recognition; Error correction; Handwriting recognition; History; Ink; Natural languages; Probability; Random processes; Speech processing; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Proceedings, 2000. WCCC-ICSP 2000. 5th International Conference on
Conference_Location
Beijing
Print_ISBN
0-7803-5747-7
Type
conf
DOI
10.1109/ICOSP.2000.893386
Filename
893386