DocumentCode
3119849
Title
Infrastructure for Annotation-Driven Information Extraction from the Primary Scientific Literature: Principles and Practice
Author
Burns, Gully ; Feng, Donghui ; Ingulfsen, Tommy ; Hovy, Eduard
Author_Institution
Univ. of Southern California, Los Angeles
fYear
2007
fDate
9-13 July 2007
Firstpage
122
Lastpage
129
Abstract
We present an informatics infrastructure for biocuration, based on a combination of techniques from information extraction (IE) and knowledge engineering (KE). We describe the high-level design of this infrastructure, which we base on the concept of 'experimental type'. Here, we treat each experiment as a specific type of knowledge statement determined by the experiment's design. We provide a preliminary, detailed example of the use of the infrastructure to support the construction of a database pertaining to neuroanatomical tract-tracing experiments. This work generalizes to provide support for other experimental types and could be used to make biocuration efforts more efficient. We also discuss how the process of annotating text for IE directly supports designing schema for databases. We envisage how this architecture could support small-scale, laboratory-centric knowledge bases that each support service-oriented functionality.
Keywords
biology computing; distributed databases; information retrieval; knowledge engineering; annotation-driven information extraction; biocuration; database schema design; informatics infrastructure; knowledge engineering; knowledge statement; laboratory-centric knowledge base; neuroanatomical tract-tracing experiment; primary scientific literature; service-oriented functionality; Bioinformatics; Biological system modeling; Data mining; Databases; Informatics; Knowledge engineering; Laboratories; Large-scale systems; Organisms; Supervised learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Services, 2007 IEEE Congress on
Conference_Location
Salt Lake City, UT
Print_ISBN
978-0-7695-2926-4
Type
conf
DOI
10.1109/SERVICES.2007.34
Filename
4278787
Link To Document