DocumentCode :
1910229
Title :
Chinese Terminology Extraction Using Bilingual Web Resources
Author :
Yang, Yuhang ; Ji, Luning ; Lu, Qin ; Zhao, Tiejun
Author_Institution :
Harbin Inst. of Technol., Harbin
fYear :
2007
fDate :
Aug. 30 2007-Sept. 1 2007
Firstpage :
347
Lastpage :
354
Abstract :
Automatic terminology extraction requires termhood verification for extracted terms in a specific domain. Chinese terminology extraction suffers from insufficient domain corpora for verification, even though there is an abundance of information in other languages. This paper presents a novel approach that overcomes this problem by using word translations and bilingual web resources to improve both coverage and precision. The proposed approach incorporates bilingual information from within the candidate terms themselves and from existing domain knowledge to conduct termhood calculation. In contrast to previous research, this method is not confined to pre-determined corpora. Preliminary experiments show improvements of 14.8% in coverage and 26.3% in precision.
Keywords :
Internet; language translation; linguistics; natural languages; vocabulary; automatic Chinese terminology extraction; bilingual Web resources; candidate terms; termhood verification; word translations; Counting circuits; Data mining; Frequency measurement; Laboratories; Natural language processing; Natural languages; Speech processing; Statistics; Terminology; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1611-0
Electronic_ISBN :
978-1-4244-1611-0
Type :
conf
DOI :
10.1109/NLPKE.2007.4368054
Filename :
4368054