DocumentCode
2373007
Title
Query translation for CLIR: EWC vs. Google Translate
Author
Klyuev, V. ; Haralambous, Y.
Author_Institution
Univ. of Aizu, Aizu-Wakamatsu, Japan
fYear
2012
fDate
23-25 March 2012
Firstpage
707
Lastpage
711
Abstract
A new approach is proposed for accurately translating search engine queries from Japanese into English for the cross-language information retrieval (CLIR) task. The Mecab system and the online dictionary SPACEALC are used to segment Japanese queries and to retrieve all possible English senses for every detected term. Terms are disambiguated by finding the shortest path on an oriented graph in which nodes represent word senses and edges connect nodes belonging to neighboring Japanese terms. The EWC semantic relatedness measure is used to select the most closely related senses for the translation. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure, and the mixed collocation index. The proposed technique is tested on the NTCIR data collection. Queries generated by Google Translate are used to evaluate the quality of the translation.
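The disambiguation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The assumptions are: the edge weight is taken as 1 minus a relatedness score (the abstract does not give the weighting), the function names `best_translation` and `toy_relatedness` are hypothetical, the toy scores stand in for the EWC measure, and the candidate senses are invented English words rather than real Mecab or SPACEALC output. Because edges only connect senses of neighboring terms, the graph is layered, so a single dynamic-programming pass finds the shortest path.

```python
from typing import Callable, Dict, List


def best_translation(candidates: List[List[str]],
                     relatedness: Callable[[str, str], float]) -> List[str]:
    """Pick one sense per term via a shortest path through a layered graph.

    candidates[i] holds the candidate English senses of the i-th query term.
    Edge weight (an assumption of this sketch): 1 - relatedness(a, b).
    """
    if not candidates or any(not layer for layer in candidates):
        return []

    # cost[s] = cheapest path cost that ends at sense s of the current term
    cost: Dict[str, float] = {s: 0.0 for s in candidates[0]}
    back: List[Dict[str, str]] = []  # back-pointers, one dict per transition

    for prev_layer, layer in zip(candidates, candidates[1:]):
        new_cost: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in layer:
            # Cheapest predecessor among the senses of the previous term.
            best_prev = min(
                prev_layer,
                key=lambda p: cost[p] + 1.0 - relatedness(p, s),
            )
            new_cost[s] = cost[best_prev] + 1.0 - relatedness(best_prev, s)
            pointers[s] = best_prev
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path back from the last term to the first.
    path = [min(cost, key=cost.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy stand-in for the EWC measure: invented scores, for illustration only.
TOY_SCORES = {("bank", "interest"): 0.9, ("interest", "rate"): 0.9}


def toy_relatedness(a: str, b: str) -> float:
    return TOY_SCORES.get((a, b), 0.1)


candidates = [["bank", "shore"], ["interest", "curiosity"], ["rate", "speed"]]
print(best_translation(candidates, toy_relatedness))
```

A real system would replace `toy_relatedness` with the combined EWC score and fill `candidates` from the segmenter and dictionary lookups.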
Keywords
dictionaries; language translation; query processing; search engines; CLIR task; EWC semantic relatedness measure; English senses; Google translate; Japanese queries; Mecab system; NTCIR data collection; Wikipedia-based explicit semantic analysis measure; cross-language information retrieval; mixed collocation index; neighboring Japanese terms; online dictionary SPACEALC; oriented graph; query translation; search engine queries; shortest path; word senses; Electronic publishing; Encyclopedias; Google; Information retrieval; Internet; Semantics;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science and Technology (ICIST), 2012 International Conference on
Conference_Location
Hubei
Print_ISBN
978-1-4577-0343-0
Type
conf
DOI
10.1109/ICIST.2012.6221738
Filename
6221738