Title of article :
An evaluation framework for cross-lingual link discovery
Author/Authors :
Ling-Xiang Tang, Shlomo Geva, Andrew Trotman, Yue Xu, Kelly Y. Itakura
Issue Information :
Bimonthly, with serial issue number, year 2014
Abstract :
Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case with Wikipedia. Techniques for identifying new and topically relevant cross-lingual links are a current topic of interest at NTCIR where the CrossLink task has been running since the 2011 NTCIR-9. This paper presents the evaluation framework for benchmarking algorithms for cross-lingual link discovery evaluated in the context of NTCIR-9.
This framework includes topics, document collections, assessments, metrics, and a toolkit for pooling, assessment, and evaluation. The assessments are divided into two sets: manual assessments performed by human assessors, and automatic assessments based on links extracted from Wikipedia itself. Using this framework, we show that manual assessment is more robust than automatic assessment in the context of cross-lingual link discovery.
Keywords :
Cross-lingual link discovery , Evaluation framework , Wikipedia , Assessment , Validation , Evaluation metrics
Journal title :
Information Processing and Management