DocumentCode :
3767511
Title :
Bilingual lexicon extraction using locally weighted linear regression from comparable corpora
Author :
Chunyue Zhang; Tiejun Zhao
Author_Institution :
School of Computer Science and Technology, Harbin Institute of Technology, China
fYear :
2015
Firstpage :
13
Lastpage :
16
Abstract :
Recently, a simple linear transformation over word embeddings has been found to be highly effective for extracting a bilingual lexicon from comparable corpora. However, transforming all words with a single transformation matrix tends to underfit. This paper proposes a simple non-parametric solution using locally weighted linear regression (LWR), in which the words in the training lexicon that are closer to the target word carry more weight in the regression objective. The experimental results confirm that the proposed solution achieves a 36.7% relative improvement at Top-1 over the baseline approach on the English-to-Chinese bilingual lexicon extraction task.
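The idea in the abstract can be illustrated with a minimal sketch of locally weighted linear regression over embeddings. This is not the authors' implementation: the abstract does not specify the kernel, the distance measure, or any regularization, so the Gaussian kernel on Euclidean distance, the bandwidth `tau`, the small `ridge` term, and the function name `lwr_map` are all assumptions made for illustration. For each query word, a separate transformation matrix is fitted, with seed-lexicon pairs weighted by how close their source embedding is to the query.

```python
import numpy as np


def lwr_map(x_query, X_src, Y_tgt, tau=1.0, ridge=1e-6):
    """Map one source-language embedding into the target space with LWR.

    x_query : (d_src,) embedding of the word to translate
    X_src   : (n, d_src) source embeddings of the seed-lexicon entries
    Y_tgt   : (n, d_tgt) target embeddings of their translations
    tau     : kernel bandwidth (hypothetical parameter, not from the paper)
    ridge   : small regularizer for numerical stability (an assumption)
    """
    # Gaussian kernel weights: seed words nearer the query count more.
    sq_dist = np.sum((X_src - x_query) ** 2, axis=1)
    w = np.exp(-sq_dist / (2.0 * tau ** 2))

    # Weighted least squares via the normal equations:
    #   (X^T W X + ridge * I) M = X^T W Y
    XtW = X_src.T * w  # scales column i of X^T by w[i]
    A = XtW @ X_src + ridge * np.eye(X_src.shape[1])
    B = XtW @ Y_tgt
    M = np.linalg.solve(A, B)  # local transformation matrix for this query

    return x_query @ M
```

The predicted vector would then be compared against target-language embeddings (for example by cosine similarity) to rank translation candidates. Fitting a fresh matrix per query is what distinguishes this from the single global matrix of the baseline, at the cost of solving one linear system per word.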
Keywords :
"Linear regression","Computational modeling","Europe"
Publisher :
IEEE
Conference_Title :
Asian Language Processing (IALP), 2015 International Conference on
Print_ISBN :
978-1-4673-9595-3
Type :
conf
DOI :
10.1109/IALP.2015.7451520
Filename :
7451520