• DocumentCode
    2912
  • Title
    Named Entity Disambiguation over Texts Written in the Portuguese or Spanish Languages
  • Author
    Santos, Joao Tiago Luis ; Anastacio, Ivo Miguel ; Martins, Bruno Emanuel
  • Author_Institution
    Inst. Super. Tecnico e INESC-ID, Univ. de Lisboa (IST/UL), Lisbon, Portugal
  • Volume
    13
  • Issue
    3
  • fYear
    2015
  • fDate
    Mar-15
  • Firstpage
    856
  • Lastpage
    862
  • Abstract
    This article addresses the problem of disambiguating named entities mentioned in text documents by linking them to entries in a knowledge base such as Wikipedia. The proposed approach uses supervised learning to rank the candidate knowledge base entries for each entity mentioned in a text, and then to classify the top-ranked entry as either the correct disambiguation or not. We present results on Portuguese and Spanish texts for a wide range of models and configuration options. Our experiments attest to the effectiveness of supervised learning methods in this specific task, showing that out-of-the-box algorithms and relatively simple features can achieve high accuracy.
  • Keywords
    Web sites; knowledge based systems; learning (artificial intelligence); natural language processing; text analysis; Portuguese language; Portuguese text; Spanish language; Spanish text; knowledge base entry; knowledge base like Wikipedia; named entity disambiguation; out-of-the-box algorithm; supervised learning method; text document; Abstracts; Electronic publishing; Encyclopedias; Google; Knowledge based systems; Supervised learning; Information Extraction; Named Entity Disambiguation; Supervised Machine Learning;
  • fLanguage
    English
  • Journal_Title
    Latin America Transactions, IEEE (Revista IEEE America Latina)
  • Publisher
    IEEE
  • ISSN
    1548-0992
  • Type
    jour
  • DOI
    10.1109/TLA.2015.7069115
  • Filename
    7069115
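
The two-stage approach summarised in the abstract (rank the candidate knowledge base entries for a mention, then decide whether the top-ranked entry is a correct disambiguation) can be illustrated with a minimal Python sketch. This is not the authors' model: the `Candidate` type, the two features, the fixed `RANK_WEIGHTS`, and the `NIL_THRESHOLD` cut-off are all hypothetical stand-ins for components that a supervised system would learn from annotated data, and the second stage here is a simple score threshold rather than a trained classifier.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Candidate:
    title: str         # knowledge base entry title (e.g. a Wikipedia page)
    popularity: float  # hypothetical prior in [0, 1], e.g. a normalised link count


# Hypothetical weights; a supervised ranker would learn these from labelled data.
RANK_WEIGHTS = {"name_similarity": 0.7, "popularity": 0.3}
# Hypothetical cut-off standing in for the second-stage classifier.
NIL_THRESHOLD = 0.5


def features(mention: str, cand: Candidate) -> dict:
    """Compute two simple features for a (mention, candidate) pair."""
    sim = SequenceMatcher(None, mention.lower(), cand.title.lower()).ratio()
    return {"name_similarity": sim, "popularity": cand.popularity}


def score(mention: str, cand: Candidate) -> float:
    """Linear ranking score over the features above."""
    f = features(mention, cand)
    return sum(RANK_WEIGHTS[name] * f[name] for name in RANK_WEIGHTS)


def disambiguate(mention: str, candidates: List[Candidate]) -> Optional[Candidate]:
    """Stage 1: rank candidates. Stage 2: accept or reject the top one."""
    if not candidates:
        return None
    ranked = sorted(candidates, key=lambda c: score(mention, c), reverse=True)
    top = ranked[0]
    # Returning None means no knowledge base entry is judged correct.
    return top if score(mention, top) >= NIL_THRESHOLD else None
```

For example, calling `disambiguate("Lisboa", [Candidate("Lisboa", 0.9), Candidate("Lisboa (Rio Grande do Sul)", 0.1)])` returns the first candidate, while a mention with no plausible candidate yields `None`.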