DocumentCode :
1635886
Title :
A comparative study on Chinese word segmentation using statistical models
Author :
Wenchao, Meng ; Lianchen, Liu ; Anyan, Chen
Author_Institution :
Dept. of Autom., Tsinghua Univ., Beijing, China
fYear :
2010
Firstpage :
482
Lastpage :
486
Abstract :
In recent years, character-based approaches to the Chinese word segmentation task have been developed and have shown great success. In this paper, a detailed comparison among different statistical models is made. Three models (HMM, MEMM and CRF) are considered. First, different tag sets are chosen to evaluate the models' precision and efficiency. Then HMM and MEMM are compared using similar features. Next, different features are compared to measure which feature contributes most to Chinese word segmentation. Finally, some suggestions are given for developing Chinese word segmentation systems.
Keywords :
hidden Markov models; natural language processing; statistical analysis; CRF; Chinese word segmentation; HMM; MEMM; statistical models; Entropy; Hidden Markov models; Joints; Mathematical model; Tagging; Technical Activities Guide - TAG; Training; comparison; crf; hmm; label bias; memm; observation bias
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Software Engineering and Service Sciences (ICSESS), 2010 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6054-0
Type :
conf
DOI :
10.1109/ICSESS.2010.5552323
Filename :
5552323