DocumentCode
676771
Title
A dynamic hashing approach to build the de Bruijn graph for genome assembly
Author
Kun Zhao ; Weiguo Liu ; Voss, Gerrit ; Muller-Wittig, Wolfgang
Author_Institution
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China
fYear
2013
fDate
22-25 Oct. 2013
Firstpage
1
Lastpage
4
Abstract
The development of next-generation sequencing technologies has revolutionized genome research and given rise to an explosive increase in DNA sequencing throughput. However, as short-read databases continue to grow explosively, these technologies face the challenges of short overlaps and high throughput. The de Bruijn graph is particularly suitable for short-read assemblies, and its advantage is that the graph size is not affected by the high redundancy of deep read coverage. With this property, fragment assembly is cast as finding a path that visits every edge in the graph exactly once. In this paper, we present a new method to accelerate the genome assembly procedure. We use a distributed dynamic hashing approach to construct the de Bruijn graph from short-read data. Evaluations using three paired-end datasets show that our method outperforms previous parallel and distributed assemblers on a CPU cluster system.
Keywords
DNA; biology computing; distributed processing; file organisation; genomics; graph theory; CPU cluster system; DNA sequencing throughput; de Bruijn graph; deep read coverage; distributed dynamic hashing approach; fragment assembly; genome assembly procedure; next-generation sequencing technologies; short-read assemblies; short-read database; Acceleration; Assembly; Bioinformatics; Couplings; Genomics; Sequential analysis; Vectors; De Bruijn graph; Dynamic hashing; Genome assembly;
fLanguage
English
Publisher
IEEE
Conference_Titel
TENCON 2013 - 2013 IEEE Region 10 Conference (31194)
Conference_Location
Xi'an
ISSN
2159-3442
Print_ISBN
978-1-4799-2825-5
Type
conf
DOI
10.1109/TENCON.2013.6719008
Filename
6719008