Regional Information Center for Science and Technology - An evaluation framework for cross-lingual link discovery

Title of article :

An evaluation framework for cross-lingual link discovery

Author/Authors :

Ling-Xiang Tang (author), Shlomo Geva (author), Andrew Trotman (author), Yue Xu (author), Kelly Y. Itakura (author)

Issue Information :

Bimonthly, with serial issue number, year 2014

Pages :

From page :

To page :

Abstract :

Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case with Wikipedia. Techniques for identifying new and topically relevant cross-lingual links are a current topic of interest at NTCIR where the CrossLink task has been running since the 2011 NTCIR-9. This paper presents the evaluation framework for benchmarking algorithms for cross-lingual link discovery evaluated in the context of NTCIR-9. This framework includes topics, document collections, assessments, metrics, and a toolkit for pooling, assessment, and evaluation. The assessments are further divided into two separate sets: manual assessments performed by human assessors; and automatic assessments based on links extracted from Wikipedia itself. Using this framework we show that manual assessment is more robust than automatic assessment in the context of cross-lingual link discovery.

Keywords :

Cross-lingual link discovery, Evaluation framework, Wikipedia, Assessment, Validation, Evaluation metrics

Journal title :

Information Processing and Management

Serial Year :

2014

Record number :

1229471

Link To Document :

https://search.isc.ac/dl/search/defaultta.aspx?DTC=10&DC=1229471