Title :
Query translation for CLIR: EWC vs. Google Translate
Author :
Klyuev, V. ; Haralambous, Y.
Author_Institution :
Univ. of Aizu, Aizu-Wakamatsu, Japan
Abstract :
A new approach to finding accurate translations of search engine queries from Japanese into English for the cross-language information retrieval (CLIR) task is proposed. The MeCab system and the online dictionary SPACEALC are used to segment Japanese queries and to obtain all possible English senses for every term detected. To disambiguate terms, the shortest path on an oriented graph is computed: nodes of this graph represent word senses, and edges connect nodes representing neighboring Japanese terms. The EWC semantic relatedness measure is used to select the most closely related meanings for the translation. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure, and the mixed collocation index. The proposed technique is tested on the NTCIR data collection. Queries generated by Google Translate are used as a point of comparison for evaluating translation quality.
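The disambiguation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how edges are weighted, so the cost `1 - relatedness` is an assumption, and `toy_relatedness` is a hypothetical stand-in for the EWC measure with made-up scores. The function names and the sample senses are likewise illustrative only.

```python
def disambiguate(senses_per_term, relatedness):
    """Pick one English sense per term via a shortest path on a layered graph.

    senses_per_term: list of non-empty lists, one list of candidate senses
                     per Japanese term, in query order.
    relatedness:     function (sense_a, sense_b) -> score in [0, 1].
    Assumption: edge cost between senses of neighboring terms is
    1 - relatedness, so more related senses yield a shorter path.
    """
    if not senses_per_term:
        return []
    # best[s] = (cost of the cheapest path ending at sense s, that path)
    best = {s: (0.0, [s]) for s in senses_per_term[0]}
    for layer in senses_per_term[1:]:
        nxt = {}
        for cur in layer:
            cost, path = min(
                ((c + 1.0 - relatedness(prev, cur), p)
                 for prev, (c, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[cur] = (cost, path + [cur])
        best = nxt
    # The cheapest path through all layers gives the chosen senses.
    return min(best.values(), key=lambda t: t[0])[1]


# Hypothetical relatedness scores; a real system would call EWC here.
TOY_SCORES = {
    frozenset(("bank", "interest")): 0.9,
    frozenset(("interest", "rate")): 0.9,
}


def toy_relatedness(a, b):
    # Unlisted pairs get a low default score.
    return TOY_SCORES.get(frozenset((a, b)), 0.1)


# Candidate senses for three neighboring terms (illustrative only).
SENSES = [["bank", "embankment"], ["interest", "curiosity"], ["rate", "ratio"]]
result = disambiguate(SENSES, toy_relatedness)
print(result)
```

Because edges only link senses of neighboring terms, the graph is layered and the shortest path can be found in a single left-to-right pass, with work proportional to the number of sense pairs between adjacent terms.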
Keywords :
dictionaries; language translation; query processing; search engines; CLIR task; EWC semantic relatedness measure; English senses; Google translate; Japanese queries; Mecab system; NTCIR data collection; Wikipedia-based explicit semantic analysis measure; cross-language information retrieval; mixed collocation index; neighboring Japanese terms; online dictionary SPACEALC; oriented graph; query translation; search engine queries; shortest path; word senses; Electronic publishing; Encyclopedias; Google; Information retrieval; Internet; Semantics;
Conference_Title :
Information Science and Technology (ICIST), 2012 International Conference on
Conference_Location :
Hubei
Print_ISBN :
978-1-4577-0343-0
DOI :
10.1109/ICIST.2012.6221738