DocumentCode
2982058
Title
Analysing the evolution of the NCI Thesaurus
Author
Gonçalves, Rafael S. ; Parsia, Bijan ; Sattler, Uli
Author_Institution
Sch. of Comput. Sci., Univ. of Manchester, Manchester, UK
fYear
2011
fDate
27-30 June 2011
Firstpage
1
Lastpage
6
Abstract
The National Cancer Institute (NCI) Thesaurus (NCIt) is a biomedical ontology that has been under development for over a decade. Nearly every month from 2003 through 2011, the NCI has published an updated version of the NCIt to the Web as an OWL ontology (as well as in other formats). We collected all 88 available OWL versions of the NCIt and conducted a cross-sectional study on this corpus to investigate and characterize the evolution of the NCIt. In particular, we gathered and analysed various axiom and entity statistics, and carried out a reasoner performance test over the corpus. Additionally, we extracted two complete sets of pairwise, consecutive diffs: the first set was generated by a purely syntactic difference analysis (based on OWL's notion of “structural equivalence”); for the second set, we also checked whether the additions or removals changed the set of entailments between versions. We discovered a high level of “merely syntactic” removals and additions. We develop a categorization of such changes based on a heuristic inference of the impact of the change. As a result, we not only get a rich, purely analytic characterization of the change history of the NCIt, but also generate a realistic test corpus for incremental classification.
Keywords
inference mechanisms; knowledge representation languages; ontologies (artificial intelligence); thesauri; NCI thesaurus; OWL ontology; Web; biomedical ontology; heuristic inference; incremental classification; national cancer institute;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer-Based Medical Systems (CBMS), 2011 24th International Symposium on
Conference_Location
Bristol
ISSN
1063-7125
Print_ISBN
978-1-4577-1189-3
Type
conf
DOI
10.1109/CBMS.2011.5999163
Filename
5999163