Title :
A Dynamic and Self-study Language Model Oriented to Chinese Characters Input
Author :
Pei-feng, Li ; Ping, Gu ; Qiao-Ming, Zhu
Author_Institution :
Sch. of Comput. Sci. & Technol., Soochow Univ., Suzhou
Abstract :
This paper proposes a statistical language model that predicts the next word to be input, in order to improve the performance of a Chinese character input method. To this end, we construct a general language model and a user language model and combine them into a new model, which we call the dynamic and self-study language model. In our experiments, the general language model reduces the average length of input codes (ALIC) from 2.557 to 2.479 and improves the hit rate of first characters (HRFC) from 78.704% to 96.202%. With the dynamic and self-study language model, the HRFC rises rapidly and the ALIC falls rapidly while fewer than 20 thousand Chinese characters have been input; beyond 20 thousand characters, both the HRFC and the ALIC become steady. These results indicate that the dynamic and self-study language model performs well in an input method. In addition, we provide a modified Church-Gale smoothing method that reduces the size of the general language model to 5 percent of its original size, so that it meets the requirements of handheld devices.
Keywords :
natural languages; smoothing methods; statistical analysis; Chinese characters input; Church-Gale smoothing method; average length of input codes; dynamic language model; general language model; hit rate of first characters; self-study language model; statistic language model; user language model; Asia; Computer science; Design methodology; Handheld computers; Keyboards; Natural languages; Predictive models; Probability; Smoothing methods; Statistics;
Conference_Title :
Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2006. SNPD 2006. Seventh ACIS International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
0-7695-2611-X
DOI :
10.1109/SNPD-SAWN.2006.3