• Title of article

    An evaluation framework for cross-lingual link discovery

  • Author/Authors

    Ling-Xiang Tang, Author; Shlomo Geva, Author; Andrew Trotman, Author; Yue Xu, Author; Kelly Y. Itakura, Author

  • Issue Information
    Bimonthly, with serial issue number, year 2014
  • Pages
    23
  • From page
    1
  • To page
    23
  • Abstract
    Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case with Wikipedia. Techniques for identifying new and topically relevant cross-lingual links are a current topic of interest at NTCIR where the CrossLink task has been running since the 2011 NTCIR-9. This paper presents the evaluation framework for benchmarking algorithms for cross-lingual link discovery evaluated in the context of NTCIR-9. This framework includes topics, document collections, assessments, metrics, and a toolkit for pooling, assessment, and evaluation. The assessments are further divided into two separate sets: manual assessments performed by human assessors; and automatic assessments based on links extracted from Wikipedia itself. Using this framework we show that manual assessment is more robust than automatic assessment in the context of cross-lingual link discovery.
  • Keywords
    Cross-lingual link discovery, Evaluation framework, Wikipedia, Assessment, Validation, Evaluation metrics
  • Journal title
    Information Processing and Management
  • Serial Year
    2014
  • Record number

    1229471