• DocumentCode
    738164
  • Title

    Dynamic On-the-Fly Minimum Cost Benchmarking for Storing Generated Scientific Datasets in the Cloud

  • Author

    Yuan, Dong ; Liu, Xiao ; Yang, Yun

  • Author_Institution
    , Swinburne University of Technology, Hawthorn, Melbourne, Australia
  • Volume
    64
  • Issue
    10
  • fYear
    2015
  • Firstpage
    2781
  • Lastpage
    2795
  • Abstract
    Massive computation power and storage capacity of cloud computing systems enable users to either store large generated scientific datasets in the cloud or delete and then regenerate them whenever reused. Due to the pay-as-you-go model, the more datasets we store, the more storage cost we need to pay, alternatively, we can delete some generated datasets to save the storage cost but more computation cost is incurred for regeneration whenever the datasets are reused. Hence, there should exist a trade-off between computation and storage in the cloud, where different storage strategies lead to different total costs. The minimum cost, which reflects the best trade-off, is an important benchmark for evaluating the cost-effectiveness of different storage strategies. However, the current benchmarking approach is neither efficient nor practical to be applied on the fly at runtime. In this paper, we propose a novel Partitioned Solution Space based approach with efficient algorithms for dynamic yet practical on-the-fly minimum cost benchmarking of storing generated datasets in the cloud. In this approach, we pre-calculate all the possible minimum cost storage strategies and save them in different partitioned solution spaces. The minimum cost storage strategy represents the minimum cost benchmark, and whenever the datasets storage cost changes at runtime in the cloud (e.g. new datasets are generated and/or existing datasets’ usage frequencies are changed), our algorithms can efficiently retrieve the current minimum cost storage strategy from the partitioned solution space and update the benchmark. By dynamically keeping the benchmark updated, our approach can be practically utilised on the fly at runtime in the cloud, based on which the minimum cost benchmark can be either proactively reported or instantly responded upon request. Case studies and experimental results based on Amazon cloud show the efficiency, scalability and practicality of our approach.
  • Keywords
    Benchmark testing; Cloud computing; Computational modeling; Heuristic algorithms; Partitioning algorithms; Runtime; Thyristors; Cloud computing; cloud computing; datasets storage and regeneration; minimum cost benchmarking; scientific applications;
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/TC.2015.2389801
  • Filename
    7005438