  • DocumentCode
    2664311
  • Title
    Word sense disambiguation combining conceptual distance, frequency and gloss
  • Author
    Rosso, Paolo ; Masulli, Francesco ; Buscaldi, Davide
  • Author_Institution
    Dept. of Sistemas Informaticos y Computacion, Polytech. Univ. of Valencia, Spain
  • fYear
    2003
  • fDate
    26-29 Oct. 2003
  • Firstpage
    120
  • Lastpage
    125
  • Abstract
    Word sense disambiguation (WSD) is the process of assigning a meaning to a word based on the context in which it occurs. The absence of sense-tagged training data is a real problem for the word sense disambiguation task. We present a method for the resolution of lexical ambiguity which relies on the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a conceptual density formula developed for this purpose. The formula we propose is a generalised form of the Agirre-Rigau conceptual density measure in which many (parameterised) refinements were introduced and an exhaustive evaluation of all meaningful combinations was performed. This fully automatic method requires no hand coding of lexical entries, no hand tagging of text, and no training process of any kind. The results of the experiment were automatically evaluated against SemCor, the sense-tagged version of the Brown Corpus.
  • Keywords
    computational linguistics; natural languages; vocabulary; Agirre-Rigau conceptual density measure; Brown Corpus; SemCor; WordNet ontology; conceptual density formula; lexical ambiguity; noun taxonomy; sense tagged training data; word sense disambiguation; Databases; Density measurement; Frequency; Information resources; Length measurement; Ontologies; Performance evaluation; Tagging; Taxonomy; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
  • Conference_Location
    Beijing, China
  • Print_ISBN
    0-7803-7902-0
  • Type
    conf
  • DOI
    10.1109/NLPKE.2003.1275880
  • Filename
    1275880