DocumentCode :
2768324
Title :
Unsupervised Translation Disambiguation Based on Maximum Web Bilingual Relatedness: Web as Lexicon
Author :
Liu, Peng Yuan ; Zhao, Tie Jun
Author_Institution :
Inst. of Comput. Linguistics, Peking Univ., Beijing, China
Volume :
7
fYear :
2009
fDate :
14-16 Aug. 2009
Firstpage :
607
Lastpage :
611
Abstract :
This paper treats the Web as a semantic lexicon in order to alleviate the problem of acquiring bilingual lexical knowledge. Four Web bilingual relatedness (WBR) measures are built from mix-language Web page counts. The WBR measures are evaluated on a modified Miller-Charles' dataset, and the measure based on point-wise mutual information achieves the best performance. The paper also presents a fully unsupervised translation disambiguation method that selects the translation maximizing the sum of WBR between the translation and all context words. When this method is tested on the multilingual Chinese-English lexical sample task of SemEval-2007, the WBR disambiguation model based on point-wise mutual information again performs best, outperforming previous work and reaching state-of-the-art results (Pmar = 0.451).
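The selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact WBR formula, so the point-wise mutual information definition below (computed from page counts and an assumed index size `n`) is an assumption, as are the function names and the toy page counts; a real system would obtain the counts from search-engine queries.

```python
import math

def pmi_relatedness(count_x, count_y, count_xy, n):
    """PMI from page counts (assumed form); 0.0 when any count is zero."""
    if count_x <= 0 or count_y <= 0 or count_xy <= 0:
        return 0.0
    p_x, p_y, p_xy = count_x / n, count_y / n, count_xy / n
    return math.log2(p_xy / (p_x * p_y))

def disambiguate(candidates, context, counts, pair_counts, n):
    """Pick the candidate translation with the largest summed relatedness
    to all context words."""
    def score(translation):
        return sum(
            pmi_relatedness(
                counts.get(translation, 0),
                counts.get(word, 0),
                pair_counts.get((translation, word), 0),
                n,
            )
            for word in context
        )
    return max(candidates, key=score)

# Hypothetical page counts, invented for illustration only.
N_PAGES = 1_000_000
COUNTS = {"bank": 1000, "shore": 1000, "存款": 500, "利率": 500}
PAIR_COUNTS = {
    ("bank", "存款"): 100, ("bank", "利率"): 80,
    ("shore", "存款"): 2, ("shore", "利率"): 1,
}

best = disambiguate(["bank", "shore"], ["存款", "利率"],
                    COUNTS, PAIR_COUNTS, N_PAGES)
print(best)
```

With these toy counts, the English candidate that co-occurs more often with the Chinese context words on mix-language pages receives the higher summed score and is selected.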
Keywords :
Internet; language translation; natural language processing; SemEval-2007; Web bilingual relatedness measurements; mix-language Web page counts; modified Miller-Charles' dataset; multilingual Chinese English lexical sample task; point-wise mutual information; semantic lexicon; unsupervised translation disambiguation method; Art; Computer science; Dictionaries; Fuzzy systems; Mutual information; Natural language processing; Search engines; State estimation; Testing; Web pages; Unsupervised word sense disambiguation; Web; bilingual relatedness; semantic lexicon
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. Sixth International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-0-7695-3735-1
Type :
conf
DOI :
10.1109/FSKD.2009.768
Filename :
5360081