DocumentCode :
2859653
Title :
Mining the Web to Create a Language Model for Mapping between English Names and Phrases and Japanese
Author :
Grefenstette, Gregory ; Qu, Yan ; Evans, David A.
Author_Institution :
LIC2M/LIST/CEA, France
fYear :
2004
fDate :
20-24 Sept. 2004
Firstpage :
110
Lastpage :
116
Abstract :
The Web provides the largest exploitable collection of language use. If we can mine the Web to build abstract models of language use, these models may have many applications. Here we present one example of using the implicit intelligence of language use to solve an important problem for machine translation programs and cross-lingual applications. This problem involves the translation of words written in katakana characters in Japanese. In this paper, we describe techniques for discovering katakana transliterations of English names and for finding English translations of multiword katakana sequences using implicit language models of English and Japanese found on the Web. These techniques were evaluated against human-constructed English-katakana glosses.
Keywords :
Costs; Data mining; Dictionaries; Gold; Information retrieval; Large-scale systems; Machine intelligence; Natural language processing; Natural languages; Writing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Intelligence, 2004. WI 2004. Proceedings. IEEE/WIC/ACM International Conference on
Print_ISBN :
0-7695-2100-2
Type :
conf
DOI :
10.1109/WI.2004.10042
Filename :
1410791