Title :
Supporting high performance bioinformatics flat-file data processing using indices
Author :
Zhang, Xuan ; Agrawal, Gagan
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH
Abstract :
As an essential part of in vitro analysis, biological database query has become more and more important in the research process. A few challenges that are specific to bioinformatics applications are data heterogeneity, large data volume and exponential data growth, constant appearance of new data types and data formats. We have developed an integration system that processes data in their flat file formats. Its advantages include the reduction of overhead and programming efforts. In the paper, we discuss the usage of indicing techniques on top of this flat file query system. Besides the advantage of processing flat files directly, the system also improves its performance and functionality by using indexes. Experiments based on real life queries are used to test the integration system.
Keywords :
biology computing; query processing; biological database query; data heterogeneity; flat file data processing; flat file query system; high performance bioinformatics; indexes; Bioinformatics; Computer science; Data analysis; Data engineering; Data processing; Databases; In vitro; Indexing; Performance analysis; Sequences;
Conference_Titel :
Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-1693-6
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2008.4536176