DocumentCode
588746
Title
Analysis on Effect Range of Context in Chinese Word Segmentation Based Word-Position Tagging
Author
Xijie Wang ; An Guo
Author_Institution
Sch. of Comput. & Inf. Eng., Anyang Normal Univ., Anyang, China
fYear
2012
fDate
2-4 Nov. 2012
Firstpage
552
Lastpage
555
Abstract
Chinese word segmentation (CWS) can be recast as a word-position tagging problem and solved with a conditional random field (CRF). This approach has greatly improved CWS performance and has come into wide use. When training a CRF on a corpus, the size of the feature window is key to the training effect. To analyze the effect range of context, string sequence tagging segmentation experiments were performed on Bakeoff2005 with the CRF++ 0.53 toolkit. The results are: (1) the context below (the following characters) contributes more than the context above (the preceding characters); (2) the feature window that influences segmentation performance is no larger than five, and the proper size is four or five.
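The word-position tagging and feature-window ideas in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions not stated in this record: the four-tag B/M/E/S scheme (the paper's tag set is not given here), the hypothetical helper names `to_position_tags` and `window_template`, and a made-up example sentence. The template lines follow the CRF++ unigram macro form `%x[row,col]`; the actual templates used in the paper are not known.

```python
def to_position_tags(words):
    """Map segmented words to (character, tag) pairs.

    Assumed scheme: B = word begin, M = middle, E = end, S = single-character word.
    """
    pairs = []
    for word in words:
        if len(word) == 1:
            pairs.append((word, "S"))
        else:
            tags = ["B"] + ["M"] * (len(word) - 2) + ["E"]
            pairs.extend(zip(word, tags))
    return pairs


def window_template(before, after):
    """Build CRF++-style unigram template lines for a character window.

    `before` / `after` are the numbers of preceding / following characters,
    so the window size is before + 1 + after.
    """
    offsets = range(-before, after + 1)
    lines = [f"U{i:02d}:%x[{off},0]" for i, off in enumerate(offsets)]
    lines.append("B")  # CRF++ bigram template over adjacent output tags
    return lines


# Hypothetical example sentence, already segmented into words.
pairs = to_position_tags(["中文", "分词", "是", "基础"])
# A symmetric window of size 5: two characters above, two below.
template = window_template(2, 2)
```

Varying `before` and `after` independently, for example `window_template(1, 2)` against `window_template(2, 1)`, is one way to compare the contribution of the context below with that of the context above.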
Keywords
identification technology; natural language processing; random processes; word processing; Bakeoff2005; CRF++0.53 toolkit; CWS; Chinese word segmentation; conditional random field; effect range analysis; feature window; string sequence tagging segmentation; training effect; word-position tagging; Computers; Context; Educational institutions; Indexes; Performance analysis; Tagging; Training; Chinese Word Segmentation; Conditional Random Field; Context; feature window;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia Information Networking and Security (MINES), 2012 Fourth International Conference on
Conference_Location
Nanjing
Print_ISBN
978-1-4673-3093-0
Type
conf
DOI
10.1109/MINES.2012.76
Filename
6405616