Title :
Automated Structure Extraction and XML Conversion of Life Science Database Flat Files
Author :
Philippi, Stephan ; Köhler, Jacob
Author_Institution :
Univ. of Koblenz
Abstract :
In light of the increasing number of biological databases, their integration is a fundamental prerequisite for answering complex biological questions. Database integration is therefore an important area of research in bioinformatics. Since most publicly available life science databases are still exchanged exclusively by means of proprietary flat files, database integration requires parsers for very different flat file formats. Unfortunately, the development and maintenance of database-specific flat file parsers is a nontrivial and time-consuming task that takes considerable effort in large-scale integration scenarios. This paper introduces heuristically based concepts for automatic structure extraction from life science database flat files. On the basis of these concepts, the FlatEx prototype is developed for the automatic conversion of flat files into XML representations.
Keywords :
XML; biology computing; data structures; database management systems; electronic data interchange; scientific information systems; FlatEx prototype; XML conversion; automated structure extraction; automatic conversion; bioinformatics; biological database integration; data exchange; data transformation; database specific flat file parsers; life science database flat files; Biology; Data mining; Jacobian matrices; Large scale integration; Light scattering; Prototypes; Spatial databases; data integration; database flat files; structure extraction
Journal_Title :
Information Technology in Biomedicine, IEEE Transactions on
DOI :
10.1109/TITB.2006.875653