Title :
Exploiting external knowledge sources to improve kernel-based Word Sense Disambiguation
Author :
Jin, Peng ; Li, Fuxin ; Zhu, Danqing ; Wu, Yunfang ; Yu, Shiwen
Author_Institution :
Inst. of Comput. Linguistics, Peking Univ., Beijing
Abstract :
This paper proposes a novel approach to improving kernel-based word sense disambiguation (WSD). We first explain why linear kernels are better suited than translation-invariant kernels to WSD and many other natural language processing problems. Two external knowledge sources are then integrated into the linear kernel. The first is a set of linguistic rules that identify the crucial features. The second is a distributional similarity thesaurus, used to alleviate data sparseness by generalizing crucial features when they do not match the word form exactly. Experiments show that the approach outperforms the state-of-the-art system on the benchmark data from the English lexical sample task of SemEval-2007, and that the improvement is statistically significant.
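The thesaurus-based generalization can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the kernel's exact form, so the soft-matching formula, the `THESAURUS` table, its similarity scores, and all function names below are assumptions made for illustration only.

```python
from typing import Dict

Features = Dict[str, float]

# Hypothetical thesaurus entries with made-up similarity scores.
THESAURUS = {
    ("river", "stream"): 0.7,
    ("bank", "lender"): 0.6,
}


def similarity(a: str, b: str) -> float:
    """Return 1.0 for identical word forms, else the thesaurus score (0.0 if absent)."""
    if a == b:
        return 1.0
    return THESAURUS.get((a, b), THESAURUS.get((b, a), 0.0))


def linear_kernel(x: Features, y: Features) -> float:
    """Plain linear kernel: dot product over exactly matching features."""
    return sum(v * y[f] for f, v in x.items() if f in y)


def soft_linear_kernel(x: Features, y: Features) -> float:
    """Assumed variant: non-identical features contribute in proportion
    to their thesaurus similarity instead of contributing nothing."""
    return sum(
        vx * vy * similarity(fx, fy)
        for fx, vx in x.items()
        for fy, vy in y.items()
    )


# Two contexts that share "water" exactly and match "river"/"stream" only via the thesaurus.
train_ctx = {"river": 1.0, "water": 1.0}
test_ctx = {"stream": 1.0, "water": 1.0}

exact = linear_kernel(train_ctx, test_ctx)
soft = soft_linear_kernel(train_ctx, test_ctx)
print(exact, soft)
```

In this toy case the plain linear kernel credits only the exact match on "water", while the soft variant also credits the "river"/"stream" pair, which is the kind of sparseness relief the abstract describes. Whether such a soft variant remains a valid (positive semidefinite) kernel depends on the similarity matrix.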
Keywords :
linguistics; natural language processing; support vector machines; thesauri; English lexical sample task; SemEval-2007; data sparseness; distributional similarity thesaurus; external knowledge sources; kernel-based word sense disambiguation; linear kernels; linguistic rules; natural language processing problems; support vector machine; Automation; Computational linguistics; Entropy; Kernel; Learning systems; Machine learning; Training data; kernel based method; word sense disambiguation
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-4515-8
Electronic_ISBN :
978-1-4244-2780-2
DOI :
10.1109/NLPKE.2008.4906810