DocumentCode
2859653
Title
Mining the Web to Create a Language Model for Mapping between English Names and Phrases and Japanese
Author
Grefenstette, Gregory ; Qu, Yan ; Evans, David A.
Author_Institution
LIC2M/LIST/CEA, France
fYear
2004
fDate
20-24 Sept. 2004
Firstpage
110
Lastpage
116
Abstract
The Web provides the largest exploitable collection of language use. If we can mine the Web to build abstract models of language use, these models may have many applications. Here we present one example of using the implicit intelligence of language use to solve an important problem for machine translation programs and cross-lingual applications: the translation of words written in katakana characters in Japanese. In this paper, we describe techniques for discovering katakana transliterations of English names and for finding English translations of multiword katakana sequences, using implicit language models of English and Japanese found on the Web. These techniques were evaluated against human-constructed English-katakana glosses.
Keywords
Costs; Data mining; Dictionaries; Gold; Information retrieval; Large-scale systems; Machine intelligence; Natural language processing; Natural languages; Writing
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence, 2004. WI 2004. Proceedings. IEEE/WIC/ACM International Conference on
Print_ISBN
0-7695-2100-2
Type
conf
DOI
10.1109/WI.2004.10042
Filename
1410791