Author_Institution :
Dept. of Comput. Sci., Wayne State Univ., Detroit, MI
Abstract :
Scientific workflows in the life sciences are usually complex: they combine many online databases, analysis tools, publication repositories, and customized, computation-intensive desktop software in a coherent manner to answer investigative queries. These queries are generally ad hoc, ill-formed, and often used only once to test a single hypothesis. In such cases, developing a customized workflow becomes a major undertaking whose cost can be prohibitive. Such high development costs often deter many interesting queries and promising, timely scientific discoveries. In this paper, we introduce BioFlow, a new query language with workflow features for scientific applications, which exploits many recent developments in Internet communication, databases, wrapper and mediator technologies, ontologies, and data integration. BioFlow is a declarative language that abstracts these features to hide most procedural aspects of mediation, data integration, communication protocols, data extraction, and workflow details. We demonstrate that fairly complex workflows can be expressed declaratively in BioFlow, in an ad hoc fashion, with little effort and at minimal cost. We also report a prototype implementation of BioFlow in Windows VB.NET that includes most of its powerful and representative features, as a proof of feasibility of our proposal.
Keywords :
Internet; biology computing; information retrieval; ontologies (artificial intelligence); query languages; scientific information systems; workflow management software; BioFlow; Internet communication; Web-based declarative workflow language; Windows VB.NET; communication protocols; customized computation intensive desktop software; data extraction; data integration; declarative language; life sciences; online databases; ontology; publication repository; query language; scientific workflows; workflow details; Abstracts; Costs; Data analysis; Database languages; Mediation; Ontologies; Software tools; Spatial databases; Testing