DocumentCode
2734692
Title
One Sense per N-gram
Author
Pengyuan Liu ; Shui Liu ; Shiqi Li ; Shiwen Yu
Author_Institution
Inst. of Comput. Linguistics, Peking Univ., Beijing, China
Volume
3
fYear
2010
fDate
Aug. 31 2010-Sept. 3 2010
Firstpage
195
Lastpage
198
Abstract
This paper presents a novel supposition, One Sense Per N-gram (N > 1), which we believe covers more linguistic phenomena and can serve as a more general version of the celebrated One Sense Per Collocation supposition, at least in Chinese. The new supposition is based on our observation of the error detection process for annotated senses in People's Daily that were tagged by an automatic WSD system. Our preliminary experiment on the Chinese Word Sense Tagging Corpus shows that it holds with over 85.9% agreement for both nouns and verbs. Based on the supposition, we built a prototype naïve Bayes WSD system and tested it on the Multilingual Chinese-English Lexical Sample task (MCELS) in SemEval-2007. Experimental results show that our prototype system improves the performance of the baseline system by 2.7%.
Keywords
Bayes methods; natural language processing; Chinese language; Chinese word sense tagging corpus; People's Daily; Semeval-2007; automatic word sense disambiguation system; error detection process; linguistic phenomena; multilingual Chinese-English lexical sample task; naïve Bayes word sense disambiguation system; one sense per N-gram; Conferences; Context; Entropy; Prototypes; Semantics; Tagging; Training; One sense per N-gram; language model; word sense disambiguation; word sense tagging
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on
Conference_Location
Toronto, ON
Print_ISBN
978-1-4244-8482-9
Electronic_ISBN
978-0-7695-4191-4
Type
conf
DOI
10.1109/WI-IAT.2010.268
Filename
5614213