DocumentCode :
3256261
Title :
Comparison of Two Methods for Finding Biomedical Categories in Medline
Author :
Yeganova, Lana ; Kim, Won ; Comeau, Donald C. ; Wilbur, W. John
Author_Institution :
Nat. Libr. of Med., Nat. Inst. of Health, Bethesda, MD, USA
Volume :
2
fYear :
2011
fDate :
18-21 Dec. 2011
Firstpage :
96
Lastpage :
99
Abstract :
In this paper we describe and compare two methods for automatically learning meaningful biomedical categories in Medline®. The first approach is a simple statistical method that uses part-of-speech and frequency information to extract a list of frequent headwords from noun phrases in Medline. The second method implements an alignment-based technique to learn frequent generic patterns that indicate a hyponymy/hypernymy relationship between a pair of noun phrases. We then apply these patterns to Medline to collect frequent hypernyms, which serve as potential biomedical categories. We study and compare these two alternative sets of terms to identify semantic categories in Medline. Our method is completely data-driven.
Keywords :
document handling; information retrieval; learning (artificial intelligence); medical computing; statistical analysis; biomedical categories; frequency information; frequent headwords; frequent hypernyms; medical literature analysis and retrieval system online; medline biomedical categories; noun phrases; part-of-speech information; semantic categories; statistical method; two method comparison; Diseases; Feature extraction; Ontologies; Semantics; Statistical analysis; Unified modeling language; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4577-2134-2
Type :
conf
DOI :
10.1109/ICMLA.2011.50
Filename :
6147055