• DocumentCode
    3714586
  • Title

    NGS read data compression using parallel computing algorithm

  • Author

    Biji C.L.; Achuthsankar S. Nair; Arun P.R; Jojo George

  • Author_Institution
    Dept. of Computational Biology and Bioinformatics, University of Kerala, Thiruvananthapuram, Pin-695581, India
  • fYear
    2015
  • Firstpage
    1456
  • Lastpage
    1460
  • Abstract
    Analysing and storing high-throughput data from Next Generation Sequencing (NGS) technologies faces major bottlenecks, because NGS machines produce data in the terabyte range. The present trend demands more sophisticated parallel computing algorithms for managing this data explosion. We propose a parallel implementation of the MFCompress algorithm using the Message Passing Interface (MPI) model. In the proposed NGS read compression algorithm, the input file is split into a number of parts determined by the number of nodes, and each processor compresses its part using multiple finite-context models. To test the proposed approach, we selected read datasets ranging from 50 MB to 10 GB. The algorithm achieved a best compression of 0.33 bpb and a speedup ratio of 6, with an average 23-fold reduction in disk space.
  • Keywords
    "Bioinformatics","Genomics","DNA","Encoding","Computational modeling","Europe","Random access memory"
  • Publisher
    ieee
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on
  • Type

    conf

  • DOI
    10.1109/BIBM.2015.7359890
  • Filename
    7359890
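
The split-then-compress scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a thread pool stands in for MPI ranks, `zlib` stands in for MFCompress's multiple finite-context models, FASTA-style input is assumed, and all function names (`split_records`, `parallel_compress`, `bits_per_base`, and so on) are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def read_records(fasta_text: str) -> List[str]:
    """Group lines into records; a new record starts at each '>' header."""
    records: List[str] = []
    for line in fasta_text.splitlines(keepends=True):
        if line.startswith(">") or not records:
            records.append(line)
        else:
            records[-1] += line
    return records


def split_records(fasta_text: str, n_parts: int) -> List[str]:
    """Split the input into at most n_parts chunks without cutting a record."""
    records = read_records(fasta_text)
    if not records:
        return []
    n_parts = max(1, min(n_parts, len(records)))
    size, extra = divmod(len(records), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append("".join(records[start:end]))
        start = end
    return chunks


def compress_chunk(chunk: str) -> bytes:
    """Compress one chunk (zlib is an assumption, a stand-in for MFCompress)."""
    return zlib.compress(chunk.encode("ascii"), 9)


def parallel_compress(fasta_text: str, n_workers: int) -> List[bytes]:
    """Compress each chunk independently; each worker plays the role of one node."""
    chunks = split_records(fasta_text, n_workers)
    with ThreadPoolExecutor(max_workers=max(1, n_workers)) as pool:
        return list(pool.map(compress_chunk, chunks))


def decompress_all(blobs: List[bytes]) -> str:
    """Restore the original text by decompressing chunks in order."""
    return "".join(zlib.decompress(b).decode("ascii") for b in blobs)


def count_bases(fasta_text: str) -> int:
    """Count sequence characters, ignoring header lines."""
    return sum(
        len(line.strip())
        for line in fasta_text.splitlines()
        if not line.startswith(">")
    )


def bits_per_base(fasta_text: str, blobs: List[bytes]) -> float:
    """Compressed size in bits divided by the number of bases (bpb)."""
    return 8 * sum(len(b) for b in blobs) / count_bases(fasta_text)
```

Because every chunk begins at a record header and is compressed independently, the chunks can be decompressed separately and concatenated in order to recover the input. Figures obtained with this sketch are not comparable to the 0.33 bpb reported in the abstract, since a general-purpose compressor replaces the finite-context models.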