Title :
Multiway Clustering for Creating Biomedical Term Sets
Author :
Kandylas, Vasileios ; Ungar, Lyle ; Sandler, Ted ; Jensen, Shane
Author_Institution :
Comput. & Inf. Sci., Univ. of Pennsylvania, Philadelphia, PA
Abstract :
We present an EM-based clustering method that can be used for constructing or augmenting ontologies such as MeSH. Our algorithm simultaneously clusters verbs and nouns using both verb-noun and noun-noun co-occurrence pairs. This strategy provides greater coverage of words than using either set of pairs alone, since not all words appear in both datasets. We demonstrate it on data extracted from Medline and evaluate the results using MeSH and WordNet.
Keywords :
bioinformatics; ontologies (artificial intelligence); pattern clustering; statistical analysis; word processing; EM-based clustering method; MeSH; WordNet; multiway clustering; ontologies; Bioinformatics; Biomedical computing; Clustering algorithms; Clustering methods; Data mining; Information science; Mutual information; Natural language processing; Ontologies; Statistics
Conference_Title :
Bioinformatics and Biomedicine, 2008. BIBM '08. IEEE International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-0-7695-3452-7
DOI :
10.1109/BIBM.2008.25