Title :
Architecture of a mediator for a bioinformatics database federation
Author :
Kemp, Graham J. L.; Angelopoulos, Nicos; Gray, Peter M. D.
Author_Institution :
Dept. of Comput. Sci., Univ. of Aberdeen, UK
fDate :
6/1/2002
Abstract :
Developments in our ability to integrate and analyze data held in existing heterogeneous data resources can lead to an increase in our understanding of biological function at all levels. However, supporting ad hoc queries across multiple data resources and correlating data retrieved from these is still difficult. To address this, we are building a mediator based on the functional data model database, P/FDM, which integrates access to heterogeneous distributed biological databases. Our architecture makes use of the existing search capabilities and indexes of the underlying databases, without infringing on their autonomy. Central to our design philosophy is the use of schemas. We have adopted a federated architecture with a five-level schema, arising from the use of the ANSI-SPARC three-level schema to describe both the existing autonomous data resources and the mediator itself. We describe the use of mapping functions and list comprehensions in query splitting, producing execution plans, code generation, and result fusion. We give an example of cross-database querying involving data held locally in P/FDM systems and external data in SRS.
Keywords :
biology computing; data models; distributed databases; program compilers; query processing; scientific information systems; ANSI-SPARC three-level schema; P/FDM functional data model database; ad hoc queries; autonomous data resources; bioinformatics database federation; biological function; code generation; cross-database querying; data analysis; data correlation; data integration; execution plans; five-level schema; heterogeneous data resources; heterogeneous distributed biological databases; indexes; list comprehensions; mapping functions; mediator architecture; multiple data resources; query splitting; result fusion; search; Bioinformatics; Biological information theory; Biology; Computer architecture; Data analysis; Data models; Database systems; Distributed databases; Information retrieval; Internet; Algorithms; Artificial Intelligence; Computational Biology; Computer Communication Networks; Database Management Systems; Databases, Factual; Decision Support Techniques; Feasibility Studies; Information Storage and Retrieval
Journal_Title :
Information Technology in Biomedicine, IEEE Transactions on
DOI :
10.1109/TITB.2002.1006298