Title :
The use of external text data in cross-language information retrieval based on machine translation
Author_Institution :
Corp. Res. Dev. Center, Toshiba Corp., Kawasaki, Japan
Abstract :
This paper explores the use of an external (i.e., non-target) document collection in cross-language information retrieval (CLIR) based on machine translation (MT). In our CLIR and monolingual IR experiments using an external target language collection, we show that parallel pseudo-relevance feedback is comparable to collection enrichment. In our CLIR experiments using an external source language collection, we show that context-sensitive translation of pre-translation expansion terms outperforms word-by-word (i.e., context-free) translation on average. Moreover, we show that combining context-sensitive translation with pseudo-relevance feedback significantly outperforms both the corresponding context-free combination and pseudo-relevance feedback alone. Thus, context-sensitive translation for pre-translation expansion is probably superior to context-free translation.
Keywords :
information retrieval; language translation; relevance feedback; context-sensitive translation; cross-language information retrieval; document collection; external source language collection; external text data; machine translation; parallel pseudorelevance feedback; pseudo-relevance feedback; Information retrieval; Laboratories; MONOS devices; Natural languages; Output feedback; Research and development; Testing
Conference_Title :
Systems, Man and Cybernetics, 2002 IEEE International Conference on
Print_ISBN :
0-7803-7437-1
DOI :
10.1109/ICSMC.2002.1175600