• DocumentCode
    1877525
  • Title
    GAGM: Genome assembly on GPU using mate pairs
  • Author
    Jain, Abhishek; Garg, Adesh; Paul, Kolin

  • Author_Institution
    Dept. of Comput. Sci. & Eng., IIT Delhi, New Delhi, India
  • fYear
    2013
  • fDate
    18-21 Dec. 2013
  • Firstpage
    176
  • Lastpage
    185
  • Abstract
    Genome fragment assembly has long been a time- and computation-intensive problem in the field of bioinformatics. Many parallel assemblers have been proposed to accelerate the process, but no effective approach has been proposed for GPUs. With the increasing power of GPUs, applications from various research fields are being parallelized to take advantage of the massive number of “cores” available in GPUs. In this paper we present the design and development of a GPU-based assembler (GAGM) for sequence assembly using Nvidia's GPUs with the CUDA programming model. Our assembler utilizes the mate pair reads produced by current NGS technologies to build a paired de Bruijn graph. Every paired read is broken into paired k-mers and l-mers. Every paired k-mer represents a vertex, and paired l-mers are mapped as edges. Contigs are formed by grouping the regions of the graph that can be unambiguously connected. We present parallel algorithms for k-mer extraction, paired de Bruijn graph construction, and grouping of edges. We have benchmarked GAGM on four bacterial genomes. Our results show that the design on GPU is effective in terms of time as well as the quality of the assembly produced.
  • Keywords
    biocomputing; graph theory; graphics processing units; parallel algorithms; parallel architectures; program assemblers; CUDA programming model; GAGM; GPU based assembler; NGS technologies; bacterial genomes; bioinformatics; edge grouping; genome fragment assembly; k-mer extraction; mate pairs; paired de Bruijn graph construction; paired k-mers; paired l-mers; paired read; parallel algorithms; parallel assemblers; sequence assembly; vertex; Benchmark testing; Bioinformatics; DNA; Encoding; Genomics; Graphics processing units; GPU; bioinformatics; genome assembly; parallel processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing (HiPC), 2013 20th International Conference on
  • Conference_Location
    Bangalore
  • Type
    conf
  • DOI
    10.1109/HiPC.2013.6799107
  • Filename
    6799107