Title :
Cluster and Grid Based Classification of Transposable Elements in Eukaryotic Genomes
Author :
Ranganathan, Nirmal ; Feschotte, Cédric ; Levine, David
Author_Institution :
Dept. of Comput. Sci. & Eng., Texas Univ., Arlington, TX
Abstract :
In the last few years, many computer and laboratory improvements in the production and analysis of DNA sequences have made the complete sequencing of whole genomes possible. This provides a wealth of raw genome sequences that need to be processed and annotated. All eukaryotic genomes examined and published thus far contain repetitive DNA. The amount of repetitive DNA in any specific eukaryotic genome ranges from 5% to 80%. These repeats consist mainly of transposable elements and tandem repeats, which need to be identified, classified and annotated in order to sequence and annotate an entire genome. This paper discusses the design and implementation of a distributed cluster and grid based workflow to classify transposable elements. We show experimental results for representative species genomes on a cluster and a grid. The performance and results of the workflow with regard to turnaround time, scalability, load balancing, resource utilization and fault tolerance are shown and discussed.
Keywords :
DNA; biology computing; fault tolerant computing; genetics; grid computing; resource allocation; workstation clusters; bioinformatics; cluster based classification; distributed cluster; distributed workflow; eukaryotic genomes; fault tolerance; grid based classification; load balancing; resource utilization; transposable elements; DNA computing; Genomics; Laboratories; Load management; Production; Resource management; Scalability; Sequences
Conference_Title :
Cluster Computing and the Grid, 2006. CCGRID 06. Sixth IEEE International Symposium on
Conference_Location :
Singapore
Print_ISBN :
0-7695-2585-7
DOI :
10.1109/CCGRID.2006.1630938