DocumentCode
130979
Title
A novel unsupervised non-iterative approach to word segmentation
Author
Hanshi Wang ; Haining Xu ; Lizhen Liu ; Wei Song ; Jingli Lu
Author_Institution
Inf. & Eng. Coll., Capital Normal Univ., Beijing, China
fYear
2014
fDate
27-29 June 2014
Firstpage
824
Lastpage
827
Abstract
Word segmentation is the crucial first step of natural language understanding (NLU) for Chinese corpora. Our earlier paper presented "Evaluation, Selection and Adjustment" (ESA), an unsupervised approach to word segmentation. In this article, we present a novel non-iterative variation of ESA and compare it with similar methods, obtaining better performance. We also analyze "Balancing" and compare it with "Standardizing", another algorithm, as solutions to a problem faced by statistical methods of word segmentation: how to evaluate words of different lengths and compare them with each other. The experimental results show that "Balancing" is more effective than "Standardizing" for this task.
Keywords
feature selection; natural language processing; text analysis; word processing; Chinese corpora; ESA; NLU; evaluation selection and adjustment; natural language understanding; unsupervised noniterative approach; word segmentation; non-iterative; standardizing; unsupervised
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on
Conference_Location
Beijing
ISSN
2327-0586
Print_ISBN
978-1-4799-3278-8
Type
conf
DOI
10.1109/ICSESS.2014.6933693
Filename
6933693