Title :
Symmetric is Not the Optimal Local Context Window in Chinese Word Sense Disambiguation
Author :
Li, Gang ; Kou, Guangzeng ; Quan, Ji
Author_Institution :
Sch. of Inf. Manage., Wuhan Univ., Wuhan, China
Abstract :
Word sense disambiguation (WSD) is a classification task in which the local context supplies the basic features for identifying the sense of an ambiguous word. Most systems choose the local context window on empirical grounds, and the chosen window is usually symmetric, meaning that the distance from the ambiguous word to each edge of the window is the same, as in [-1, +1] or [-2, +2]. Is a symmetric window better than an asymmetric one? In this paper, we take the Senseval-3 Chinese data set as an example. We first find the optimal window, estimated by cross-validation using only the training set, which turns out to be an asymmetric window. We then perform a WSD evaluation on the test data using this asymmetric window and compare it with classical symmetric windows. The results show that the asymmetric window outperforms the symmetric windows, so a symmetric window is not always the best option.
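The window-selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name a classifier, so a small naive Bayes model with add-one smoothing is assumed here, and the function names (`window_features`, `cv_accuracy`, `best_window`), the candidate range `max_span`, and the fold scheme are all hypothetical. A window [-left, +right] is symmetric when `left == right` and asymmetric otherwise; every candidate is scored by cross-validation on the training examples only.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def window_features(tokens, target, left, right):
    """Positional features for the window [-left, +right] around tokens[target]."""
    feats = []
    for offset in range(-left, right + 1):
        pos = target + offset
        if offset != 0 and 0 <= pos < len(tokens):
            feats.append((offset, tokens[pos]))
    return feats


def train_nb(examples, left, right):
    """Count senses and (offset, token) features; examples are (tokens, target, sense)."""
    sense_counts, feat_counts, vocab = Counter(), defaultdict(Counter), set()
    for tokens, target, sense in examples:
        sense_counts[sense] += 1
        for feat in window_features(tokens, target, left, right):
            feat_counts[sense][feat] += 1
            vocab.add(feat)
    return sense_counts, feat_counts, vocab


def predict_nb(model, tokens, target, left, right):
    """Return the sense with the highest smoothed log-probability (assumed classifier)."""
    sense_counts, feat_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense in sorted(sense_counts):
        denom = sum(feat_counts[sense].values()) + len(vocab) + 1
        score = math.log(sense_counts[sense] / total)
        for feat in window_features(tokens, target, left, right):
            score += math.log((feat_counts[sense][feat] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


def cv_accuracy(examples, left, right, folds=5):
    """k-fold cross-validated accuracy of one window, using training examples only."""
    correct = 0
    for k in range(folds):
        held_out = examples[k::folds]
        train = [ex for i, ex in enumerate(examples) if i % folds != k]
        model = train_nb(train, left, right)
        correct += sum(
            predict_nb(model, tokens, target, left, right) == sense
            for tokens, target, sense in held_out
        )
    return correct / len(examples)


def best_window(examples, max_span=3, folds=5):
    """Search all windows [-left, +right] with 0 <= left, right <= max_span."""
    candidates = [
        (left, right)
        for left, right in product(range(max_span + 1), repeat=2)
        if left + right > 0
    ]
    return max(candidates, key=lambda w: cv_accuracy(examples, w[0], w[1], folds))
```

Calling `best_window(training_examples)` returns a `(left, right)` pair. Because the search covers symmetric and asymmetric candidates alike, the selected window can then be compared on held-out test data against fixed symmetric windows such as [-1, +1] or [-2, +2].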
Keywords :
computational linguistics; pattern classification; Chinese word sense disambiguation; Senseval-3 Chinese data set; asymmetric window; cross-validation estimation; optimal local context window; symmetric window; Computer science; Feature extraction; Information management; Information technology; Performance evaluation; Speech; Statistical analysis; Systems engineering and theory; Testing; Training data; Word Sense Disambiguation; context window; local context
Conference_Title :
Information Technology and Computer Science, 2009. ITCS 2009. International Conference on
Conference_Location :
Kiev
Print_ISBN :
978-0-7695-3688-0
DOI :
10.1109/ITCS.2009.47