Title :
Resource capability discovery and description management system for bioinformatics data and service integration - an experiment with gene regulatory networks
Author_Institution :
Dept. of Comput. Sci., Integration Inf. Lab., Detroit, MI
Abstract :
Traditional legacy HTML-based web sites and pages can be thought of as web services, because dynamic web pages take user input through web forms and respond to user queries. The ability of agents and services to automatically locate and interact with unknown partners is a goal for Web-based data integration systems. This "serendipitous interoperability" is hindered by the lack of an explicit means of describing what web pages are able to do, what input they take, and what output they produce, that is, what their capabilities are [1]. The tremendous success of the WWW is offset by the effort needed to search for and find relevant information. For tabular structures embedded in HTML documents, typical keyword or link-analysis based search fails. The next phase envisioned for the WWW is automatic ad hoc interaction between intelligent agents, web services, databases, and semantic web enabled applications. A large amount of the information available on the Web is formatted in HTML tables, which are mainly presentation oriented and are not suited for database applications. As a result, capturing the information in HTML tables semantically and integrating the relevant information is a challenge. We envision another layer of web abstraction in which users can query table-like structures within web documents. Our prototype application is based on WebFusion and an ad hoc query language, BioFlow [2], [3], [4], [5], [6], a software agent that can simulate a person interacting with web search forms and extract information from the resulting pages by means of an API. We aim to develop a framework that can query web search forms and web page tables in an SQL-like way. In this context we also report a Java-based implementation for integrating the Flybase and AlignACE sites.
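The idea of querying web page tables in an SQL-like way can be illustrated with a minimal sketch. It is not the WebFusion or BioFlow implementation; it only assumes that a result page contains a simple HTML table whose first row holds the column names. The class `TableExtractor`, the function `table_to_sqlite`, the table name `page_table`, and the sample gene and motif values are all hypothetical, chosen for illustration.

```python
import sqlite3
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of every <tr> row in the page as a list of cells."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_sqlite(html, table_name="page_table"):
    """Load an HTML table into an in-memory SQLite table (all columns TEXT)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows  # assumption: first row is the header
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({cols})')
    marks = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({marks})', body)
    return conn


# Hypothetical page content; not taken from Flybase or AlignACE.
SAMPLE_HTML = """
<table>
  <tr><th>gene</th><th>motif</th><th>score</th></tr>
  <tr><td>geneA</td><td>ACGTGA</td><td>12.5</td></tr>
  <tr><td>geneB</td><td>TTGACA</td><td>7.1</td></tr>
</table>
"""

conn = table_to_sqlite(SAMPLE_HTML)
rows = conn.execute(
    "SELECT gene, motif FROM page_table WHERE CAST(score AS REAL) > 10"
).fetchall()
print(rows)
```

Real pages typically need more than this: nested tables, spanning cells, and layout-only tables are why a presentation-oriented HTML table is hard to treat as a database relation.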
Keywords :
SQL; Web services; Web sites; bioinformatics; genetics; hypermedia markup languages; open systems; query processing; semantic Web; API; AlignACE site; BioFlow; Flybase site; HTML documents; Java based implementation; Web abstraction; Web based data integration system; Web document; Web forms; WebFusion; ad hoc query language; automatic ad hoc interaction; bioinformatics data; databases; description management system; dynamic Web pages; gene regulatory networks; intelligent agents; keyword analysis based search; link analysis based search; resource capability discovery; serendipitous interoperability; service integration; software agent; table like structure; tabular structures; HTML; Intelligent agent; Resource management; Software prototyping; Web pages; World Wide Web; HTML forms; Hidden Web; Intelligent Wrapper; Ontology generation; Table modeling; Table structure; Web Automation; Web Data Integration; Web Information Extraction; Web mining; extraction ontology
Conference_Title :
Computer and Information Technology, 2008. ICCIT 2008. 11th International Conference on
Conference_Location :
Khulna
Print_ISBN :
978-1-4244-2135-0
Electronic_ISBN :
978-1-4244-2136-7
DOI :
10.1109/ICCITECHN.2008.4802991