Regional Information Center for Science and Technology - Utilizing background corpus and dictionary to calculate similarity between unknown words

DocumentCode :

2137251

Title :

Utilizing background corpus and dictionary to calculate similarity between unknown words

Author :

Fan, Xinghua ; Chen, Xianlin ; Hu, Hongge

Author_Institution :

College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, China

fYear :

2010

fDate :

4-6 Dec. 2010

Firstpage :

1669

Lastpage :

1672

Abstract :

This paper presents a method that uses a background corpus and a dictionary to calculate the similarity between unknown words. In this method, the best concept expression of an unknown word is obtained from the word's background in the corpus, and a context is then constructed for that best concept expression. The connotative meaning of the unknown word is determined by the difference between the context of the best concept expression and the unknown word's own context. The similarity between unknown words is then calculated using a semantic dictionary. This method avoids the problems of mistaken segmentation and abused segmentation that arise in the traditional segmentation-based approach to calculating similarity between unknown words. Experimental results show that the proposed method is highly effective.
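
The pipeline outlined in the abstract could be sketched roughly as follows. This is an illustrative toy under several assumptions, not the authors' implementation: the corpus is assumed to be already tokenized with each unknown word as a single token, the "best concept expression" is approximated as the dictionary word whose co-occurrence context has the highest cosine similarity to the unknown word's context, and a small hand-made table (`dict_sim`) stands in for a semantic dictionary such as HowNet. The step that refines the connotative meaning from the difference between the two contexts is omitted. All function names, the window size, and the data are hypothetical.

```python
import math
from collections import Counter


def context_vector(word, sentences, window=2):
    """Count tokens that co-occur with `word` within +/- `window` positions."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == word:
                lo = max(0, i - window)
                hi = min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_concept_expression(unknown, sentences, dictionary_words):
    """Assumed proxy: the dictionary word whose context is closest to the unknown word's."""
    ctx = context_vector(unknown, sentences)
    return max(dictionary_words,
               key=lambda w: cosine(ctx, context_vector(w, sentences)))


def unknown_word_similarity(u1, u2, sentences, dict_sim):
    """Map each unknown word to a dictionary word, then look up the dictionary similarity."""
    words = sorted({w for pair in dict_sim for w in pair})
    c1 = best_concept_expression(u1, sentences, words)
    c2 = best_concept_expression(u2, sentences, words)
    if c1 == c2:
        return 1.0
    return dict_sim.get(frozenset((c1, c2)), 0.0)


# Toy background corpus (hypothetical); "netbook" and "durian" play the unknown words.
sentences = [
    ["i", "bought", "a", "netbook", "yesterday"],
    ["i", "bought", "a", "computer", "yesterday"],
    ["she", "ate", "a", "ripe", "durian", "today"],
    ["she", "ate", "a", "ripe", "fruit", "today"],
]

# Stand-in for a semantic dictionary: pairwise similarity of known words (made-up value).
dict_sim = {frozenset(("computer", "fruit")): 0.1}

print(best_concept_expression("netbook", sentences, ["computer", "fruit"]))
print(unknown_word_similarity("netbook", "durian", sentences, dict_sim))
```

Because each unknown word is matched to a known word through its corpus context rather than by splitting it into sub-words, a sketch of this kind never segments the unknown word itself, which is the property the abstract credits with avoiding segmentation errors.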

Keywords :

Computational modeling; Computer science; Context; Dictionaries; Semantics; Statistical analysis; Telecommunications; HowNet; segmentation; similarity of words; unknown word;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Information Science and Engineering (ICISE), 2010 2nd International Conference on

Conference_Location :

Hangzhou, China

Print_ISBN :

978-1-4244-7616-9

Type :

conf

DOI :

10.1109/ICISE.2010.5690768

Filename :

5690768

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2137251