Title :
Automatically acquiring part of speech correcting rules of multi-category words based on incomplete decision tables
Author :
Wang, Suge ; Yang, Junling ; Li, Deyu ; Zhang, Wu
Author_Institution :
Sch. of Comput. Eng. & Sci., Shanghai Univ., China
Date :
30 Oct.-1 Nov. 2005
Abstract :
Part-of-speech (POS) tagging is a fundamental task in Chinese information processing. In general, the existence of multi-category words greatly affects the quality of processed corpora. Highly efficient methods and automatic correction techniques for tagging multi-category words are key to improving tagging precision. This paper introduces a modeling method, based on an incomplete decision table, for correcting the part-of-speech tags of multi-category words. It also presents two algorithms based on attribute significance, one for attribute reduction and one for object reduction, which are used to acquire correcting rules automatically. Test results show that the method is effective in improving part-of-speech tagging precision in large-scale corpus engineering.
Keywords :
computational linguistics; decision tables; natural languages; speech processing; Chinese information processing; POS; multicategory word tagging; part of speech; Context modeling; Information processing; Information technology; Large-scale systems; Mathematics; Set theory; Statistics; Tagging; Testing
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN :
0-7803-9361-9
DOI :
10.1109/NLPKE.2005.1598709