• DocumentCode
    139093
  • Title
    A fast sequence assembly method based on compressed data structures
  • Author
    Peifeng Liang ; Yancong Zhang ; Kui Lin ; Jinglu Hu
  • Author_Institution
    Grad. Sch. of Inf. Production & Syst., WASEDA Univ., Fukuoka, Japan
  • fYear
    2014
  • fDate
    26-30 Aug. 2014
  • Firstpage
    326
  • Lastpage
    329
  • Abstract
    Assembling a large genome from next-generation sequencing reads requires a large amount of computer memory and a long execution time. To reduce these requirements, a memory- and time-efficient assembler, called FMJ-Assembler, is presented that applies the FM-index within JR-Assembler; FM stands for the FMR-index, which is derived from the FM-index and the Burrows-Wheeler transform (BWT), and J stands for jumping extension. FMJ-Assembler uses an expanded FM-index and the BWT to compress the read data and save memory, while the jumping extension method reduces CPU time. An extensive comparison with current assemblers shows that FMJ-Assembler achieves better or comparable overall assembly quality while requiring less memory and less CPU time. These advantages indicate that FMJ-Assembler will be an efficient assembly method for next-generation sequencing technology.
  • Keywords
    bioinformatics; data compression; genetics; genomics; BWT; FMJ-Assembler; FMR-index; JR-Assembler; compressed data structures; computer memory; execution time; expanded FM-index; fast sequence assembly method; genome; jumping extension method; next generation sequencing reads; next generation sequencing technology; overall assembly quality; Assembly; Bioinformatics; Data structures; Genomics; Indexes; Memory management; Sequential analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE
  • Conference_Location
    Chicago, IL
  • ISSN
    1557-170X
  • Type
    conf
  • DOI
    10.1109/EMBC.2014.6943595
  • Filename
    6943595