• DocumentCode
    168587
  • Title
    Improving I/O Throughput of Scientific Applications Using Transparent Parallel Compression
  • Author
    Bicer, Tekin; Yin, Jian; Agrawal, Gagan
  • Author_Institution
    Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
  • fYear
    2014
  • fDate
    26-29 May 2014
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    The increasing number of cores in parallel computer systems is allowing scientific simulations to be executed with increasing spatial and temporal granularity. However, this also implies that increasingly large datasets need to be output, stored, managed, and then visualized and/or analyzed using a variety of methods. In examining the possibility of using compression to accelerate all of these steps, we focus on two important questions: "Can compression help save time when data is output from, or input into, a parallel program?" and "How can a scientist's effort in using compression with a parallel program be minimized?" We focus on PnetCDF and show how transparent compression can be supported, thus allowing an existing simulation program to start outputting and storing data in a compressed fashion and, similarly, allowing a data analysis application to read compressed data. We address challenges in supporting compression when parallel writes are being performed. In our experiments, we first analyze the effects of using compression with microbenchmarks, and then continue our evaluation using a scientific simulation application and two data analysis applications. While we obtain up to a factor of 2 improvement in performance for microbenchmarks, the execution time of the simulation application is improved by up to 22%, and the maximum speedup of the data analysis applications is 1.83 (with an average speedup of 1.36).
  • Keywords
    data analysis; data compression; parallel programming; PnetCDF; data analysis applications; input-output throughput; parallel computer systems; parallel program; spatial granularity; temporal granularity; transparent parallel compression; Analytical models; Compression algorithms; Computational modeling; Data models; Data visualization; Libraries; Writing; compression; scientific data management
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on
  • Conference_Location
    Chicago, IL
  • Type
    conf
  • DOI
    10.1109/CCGrid.2014.112
  • Filename
    6846435