Title :
Integrating life sciences data - with a little Garlic
Author :
Haas, Laura M. ; Kodali, Prasad ; Rice, Julia E. ; Schwarz, Peter M. ; Swope, William C.
Author_Institution :
IBM Almaden Res. Center, San Jose, CA, USA
Abstract :
Vast amounts of life sciences data today reside in specialized data sources, with specialized query processing capabilities. Data from one source must often be combined with data from other sources to give users the information they desire. Database middleware systems such as Garlic allow users to combine data from multiple sources in a single query. Garlic provides the user with a virtual database to which they can pose arbitrarily complex queries, though the actual data needed to answer the query may be stored in several different sources, and those sources may not even possess all the functionality needed to answer such a query themselves. The Garlic technology, as incorporated in IBM's DB2 product, forms the basis of the DiscoveryLink service offering for the life sciences industry. We describe the DiscoveryLink offering, focusing on two key contributions of Garlic, the wrapper architecture and the query optimizer, and illustrate how it can be used to integrate life sciences data from heterogeneous data sources.
Keywords :
biology computing; client-server systems; distributed databases; query processing; scientific information systems; DiscoveryLink service offering; Garlic; IBM DB2; arbitrarily complex queries; database middleware system; heterogeneous data sources; life sciences data integration; query optimizer; specialized data sources; specialized query processing capabilities; virtual database; wrapper architecture; Bioinformatics; Content based retrieval; Cost function; Data mining; Databases; Encapsulation; Genomics; Information retrieval; Middleware; Query processing
Conference_Title :
Bio-Informatics and Biomedical Engineering, 2000. Proceedings. IEEE International Symposium on
Conference_Location :
Arlington, VA
Print_ISBN :
0-7695-0862-6
DOI :
10.1109/BIBE.2000.889583