Title :
Infrastructure for Annotation-Driven Information Extraction from the Primary Scientific Literature: Principles and Practice
Author :
Burns, Gully ; Feng, Donghui ; Ingulfsen, Tommy ; Hovy, Eduard
Author_Institution :
Univ. of Southern California, Los Angeles
Abstract :
We present an informatics infrastructure for biocuration, based on a combination of techniques from information extraction (IE) and knowledge engineering (KE). We describe the high-level design of this infrastructure, which we base on the concept of 'experimental type'. Here, we treat each experiment as a specific type of knowledge statement determined by the experiment's design. We provide a preliminary, detailed example of the use of the infrastructure to support the construction of a database pertaining to neuroanatomical tract-tracing experiments. This work generalizes to provide support for other experimental types and could be used to make biocuration efforts more efficient. We also discuss how the process of annotating text for IE directly supports designing schemas for databases. We envisage how this architecture could support small-scale, laboratory-centric knowledge bases that each support service-oriented functionality.
Keywords :
biology computing; distributed databases; information retrieval; knowledge engineering; annotation-driven information extraction; biocuration; database schema design; informatics infrastructure; knowledge statement; laboratory-centric knowledge base; neuroanatomical tract-tracing experiment; primary scientific literature; service-oriented functionality; Bioinformatics; Biological system modeling; Data mining; Databases; Informatics; Laboratories; Large-scale systems; Organisms; Supervised learning
Conference_Title :
Services, 2007 IEEE Congress on
Conference_Location :
Salt Lake City, UT
Print_ISBN :
978-0-7695-2926-4
DOI :
10.1109/SERVICES.2007.34