Title :
Applying the Word Acquiring Algorithm to the Pinyin-to-Character Conversion
Author :
Wei, Jiang ; Li, Pang Xiu
Author_Institution :
Res. Center of Inf. Manage. & Inf. Syst., Harbin Inst. of Technol., Harbin, China
Abstract :
This paper applies an information-entropy-based word acquiring algorithm to the task of Pinyin-to-character (PTC) conversion, where the conversion adopts an artificial immune network model. First, the artificial immune network is used to overcome the sparse data problem and the independent and identically distributed (i.i.d.) assumption. Second, a word acquiring algorithm based on information entropy is presented to collect Chinese words and some typical combinations. The experiments show that the method achieves better performance than the n-gram language model, an improvement that is difficult to obtain with classical supervised learning models. In addition, applying the word acquiring method further improves PTC performance.
Keywords :
artificial immune systems; learning (artificial intelligence); word processing; Chinese word; Pinyin-to-character conversion; artificial immune network model; independent identical distribution; information entropy; sparse data problem; supervised learning models; word acquiring algorithm; word acquiring method; Dictionaries; Electronic mail; Error analysis; Feedback; Information entropy; Information management; Management information systems; Natural languages; Supervised learning; Support vector machines; Information Entropy; Pinyin-to-Character Conversion; Word Acquiring Algorithm;
Conference_Title :
Natural Computation, 2009. ICNC '09. Fifth International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-0-7695-3736-8
DOI :
10.1109/ICNC.2009.568