DocumentCode :
2355862
Title :
High-throughput computation of pairwise sequence similarities for multiple genome comparisons using ScalaBLAST
Author :
Shah, Anuj R. ; Markowitz, Victor M. ; Oehmen, Christopher S.
Author_Institution :
Pacific Northwest Nat. Lab., Richland
fYear :
2007
fDate :
8-9 Nov. 2007
Firstpage :
89
Lastpage :
91
Abstract :
Genome sequence comparisons of exponentially growing data sets form the foundation for the comparative analysis tools provided by community biological data resources such as the Integrated Microbial Genomes (IMG) system at the Joint Genome Institute (JGI). For a genome sequencing center to provide multiple-genome comparison capabilities, it must keep pace with an exponentially growing collection of sequence data, both from its own genomes and from public genomes. We present an example of how ScalaBLAST, a high-throughput sequence analysis program, harnesses increasingly critical high-performance computing to perform sequence analysis. For example, it enables all vs. all BLAST runs across 2 million protein sequences within a day using thousands of processors, whereas conventional comparison methods would take years to complete.
Keywords :
biology computing; genetics; molecular biophysics; molecular configurations; proteins; ScalaBLAST; genome sequencing; high-throughput sequence analysis program; multiple genome comparisons; pairwise sequence similarities; protein sequences; Bioinformatics; Biology computing; Computational biology; Data analysis; Genomics; Performance analysis; Proteins; Resource management; Sequences; Technology management;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Life Science Systems and Applications Workshop, 2007. LISA 2007. IEEE/NIH
Conference_Location :
Bethesda, MD
Print_ISBN :
978-1-4244-1813-8
Electronic_ISBN :
978-1-4244-1813-8
Type :
conf
DOI :
10.1109/LSSA.2007.4400891
Filename :
4400891