• DocumentCode
    167525
  • Title

    Improving I/O Performance with Adaptive Data Compression for Big Data Applications

  • Author

    Hongbo Zou; Yongen Yu; Wei Tang; Hsuanwei Michelle Chen

  • Author_Institution
    Georgia Inst. of Technol., Atlanta, GA, USA
  • fYear
    2014
  • fDate
    19-23 May 2014
  • Firstpage
    1228
  • Lastpage
    1237
  • Abstract
    Increasingly large-scale simulations are generating an unprecedented amount of data. However, the widening gap between computation and I/O capacity on High End Computing machines creates a severe bottleneck for data analysis. As a solution, in-situ analytics processes output data while simulations are running and before the data is placed on disk. Data movement between simulation and analytics, however, adds overhead to in-situ analytics at scale. This paper tries to answer the following question: can we use compression technology to reduce the data movement cost and improve the performance of in-situ analytics for peta-scale applications? In particular, we explore when, where, and how to use compression techniques to reduce the data movement cost between simulation and analytics. To find the best algorithm and place to compress data in a given situation, we introduce an adaptive data compression algorithm. The adaptive compression service is developed and analyzed for the in-situ analytics middleware. Experimental results demonstrate that the compression service increases data transition bandwidth and improves the application's End-to-End transfer performance.
  • Keywords
    Big Data; data analysis; data compression; Big Data applications; I/O performance; adaptive compression service; adaptive data compression algorithm; adaptive data compression technology; data movement cost; data transition bandwidth; high end computing machines; petascale applications; Analytical models; Bandwidth; Compression algorithms; Computational modeling; Data compression; Data models; Data transfer; Compression; High-end Computing; I/O Bottlenecks; In-situ Analytics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
  • Conference_Location
    Phoenix, AZ
  • Print_ISBN
    978-1-4799-4117-9
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2014.138
  • Filename
    6969520