Title :
Preliminary Results for Biomedical Word Sense Disambiguation Based on Semantic Clustering
Author :
Martín-Wanton, Tamara ; Berlanga-Llavori, Rafael ; Jimeno-Yepes, Antonio
Author_Institution :
Univ. Jaume I, Castellon, Spain
fDate :
Aug. 29 2011-Sept. 2 2011
Abstract :
Word sense disambiguation (WSD) is an intermediate task within information retrieval and information extraction that attempts to select the proper sense of ambiguous words. Because training data are scarce, knowledge-based and knowledge-lean methods have received attention as disambiguation methods. Knowledge-based methods compare the context of the ambiguous word to the information available in the terminological resource, but word sense disambiguation is not their sole purpose. Knowledge-lean unsupervised methods rely on term distributions instead of a resource enumerating the possible senses, but they might be inappropriate when there is a requirement to commit to a terminological resource. In this work, we rely on a Knowledge Resource (KR) that provides both an inventory of concepts and their lexical information. Our aim is to design scalable unsupervised WSD methods for the semantic annotation of large biomedical corpora. More specifically, we present a clustering-based method that takes advantage of the KR information encoded in the form of kernels. Preliminary results are compared to state-of-the-art methods for unsupervised WSD.
Keywords :
information retrieval; medical computing; pattern clustering; unsupervised learning; biomedical word sense disambiguation; information extraction; knowledge resource; knowledge-based method; knowledge-lean unsupervised methods; semantic clustering; Biomedical measurements; Context; Kernel; Knowledge based systems; Semantics; USA Councils; Unified modeling language; clustering; word sense disambiguation
Conference_Title :
Database and Expert Systems Applications (DEXA), 2011 22nd International Workshop on
Conference_Location :
Toulouse
Print_ISBN :
978-1-4577-0982-1
DOI :
10.1109/DEXA.2011.66