Title :
Parallelization of the SSAKE Genomics Application
Author :
D'Agostino, Daniele ; Merelli, Ivan ; Warren, René ; Guffanti, Alessandro ; Milanesi, Luciano ; Clematis, Andrea
Author_Institution :
Inst. for Appl. Math. & Inf. Technol., Nat. Res. Council of Italy, Genoa, Italy
Abstract :
Recent advances in sequencing technology make it possible to generate large amounts of data in a short time. However, the fragments produced by these high-throughput methods are much shorter than those from traditional Sanger sequencing, which makes an efficient sequence assembly algorithm essential. Of the two approaches commonly applied to genome assembly, overlap graphs and sequence hashing, the latter allows aggressive assembly of millions of short fragments at a reasonable memory and computational cost. In particular, SSAKE, one of the first and most popular implementations of the sequence hashing algorithm, was designed to leverage the information from short sequence reads by stringently assembling them into contiguous sequences that can be used to characterize novel sequencing targets. In this paper we present a parallel version of this tool that provides a fast, lightweight, and scalable solution for modern genome assembly.
Keywords :
biocomputing; genomics; molecular biophysics; sequences; SSAKE genomics; Sanger sequencing; genome assembly; overlap graph; sequence assembly algorithm; sequence hashing; sequencing technology; Arrays; Assembly; Bioinformatics; DNA; Genomics; Memory management; parallel de novo assembly; parallel ssake;
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2011 19th Euromicro International Conference on
Conference_Location :
Ayia Napa
Print_ISBN :
978-1-4244-9682-2
DOI :
10.1109/PDP.2011.40