• DocumentCode
    676771
  • Title

    A dynamic hashing approach to build the de Bruijn graph for genome assembly

  • Author

    Kun Zhao ; Weiguo Liu ; Voss, Gerrit ; Muller-Wittig, Wolfgang

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China
  • fYear
    2013
  • fDate
    22-25 Oct. 2013
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    The development of next-generation sequencing technologies has revolutionized genome research and driven an explosive increase in DNA sequencing throughput. However, as short-read databases continue to grow explosively, these technologies face the challenges of short overlaps and high throughput. The de Bruijn graph is particularly suitable for short-read assembly; its advantage is that the graph size is not affected by the high redundancy of deep read coverage. With this property, fragment assembly is cast as finding a path that visits every edge in the graph exactly once. In this paper, we present a new method to accelerate the genome assembly procedure. We use a distributed dynamic hashing approach to construct the de Bruijn graph from short-read data. Evaluations using three paired-end datasets show that our method outperforms previous parallel and distributed assemblers on a CPU cluster system.
  • Keywords
    DNA; biology computing; distributed processing; file organisation; genomics; graph theory; CPU cluster system; DNA sequencing throughput; de Bruijn graph; deep read coverage; distributed dynamic hashing approach; fragment assembly; genome assembly procedure; next-generation sequencing technologies; short-read assemblies; short-read database; Acceleration; Assembly; Bioinformatics; Couplings; Genomics; Sequential analysis; Vectors; De Bruijn graph; Dynamic hashing; Genome assembly;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    TENCON 2013 - 2013 IEEE Region 10 Conference (31194)
  • Conference_Location
    Xi'an
  • ISSN
    2159-3442
  • Print_ISBN
    978-1-4799-2825-5
  • Type

    conf

  • DOI
    10.1109/TENCON.2013.6719008
  • Filename
    6719008