DocumentCode :
2137251
Title :
Utilizing background corpus and dictionary to calculate similarity between unknown words
Author :
Fan, Xinghua ; Chen, Xianlin ; Hu, Hongge
Author_Institution :
College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, China
fYear :
2010
fDate :
4-6 Dec. 2010
Firstpage :
1669
Lastpage :
1672
Abstract :
This paper presents a method that uses a background corpus and a dictionary to calculate the similarity between unknown words. In this method, the best concept expression of an unknown word is obtained from its background in the corpus, and a context is then constructed for that best concept expression. The connotative meaning of the unknown word is determined by the difference between the context of the best concept expression and the word's own context. The similarity between unknown words is then calculated using a semantic dictionary. This method avoids the problems of mistaken segmentation and abused segmentation that arise in the traditional, segmentation-based method of calculating similarity between unknown words. Experimental results show that the proposed method is highly effective.
Keywords :
Computational modeling; Computer science; Context; Dictionaries; Semantics; Statistical analysis; Telecommunications; HowNet; segmentation; similarity of words; unknown word;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4244-7616-9
Type :
conf
DOI :
10.1109/ICISE.2010.5690768
Filename :
5690768