Author Institution:
Dept. of Comput. Sci., Georgia State Univ., Atlanta, GA, USA
Abstract:
Cloud computing has emerged rapidly as a paradigm for on-demand access to computing, data, and software utilities under a usage-based billing model. Many massive-data applications, including database applications and web search, are well suited to cloud platforms. However, many legacy scientific computing codes are written with traditional parallel programming models such as MPI and OpenMP and cannot be executed on these cloud platforms. With current cloud programming models, complicated scientific computing algorithms cannot be implemented easily or executed efficiently on many cloud platforms. In this talk, I will review different massively parallel computing platforms and compare various computing domains and programming models on these platforms. In particular, I will point out the shortcomings and limitations of current cloud programming models for typical scientific computing algorithms and propose possible solutions. Current cloud models such as MapReduce and Spark, and their variants, have succeeded in data-parallel applications such as database operations and web search; however, they are still not effective for applications with many data dependencies, such as scientific computing applications. We propose several approaches to this problem: extending current programming models; automatically translating sequential codes to cloud codes; building simple APIs and frameworks on top of current cloud models and traditional models such as MPI; and detecting data and task parallelism and scheduling them efficiently. Some preliminary theoretical and experimental results will also be reported in this talk.