Authors :
Wottrich, Rodolfo ; Azevedo, Rodolfo ; Araujo, Gabriel
Abstract :
Harnessing the flexibility and scaling features of the cloud can open up opportunities to address some relevant research problems in scientific computing. Nevertheless, cloud-based parallel programming models need to address some relevant issues, namely communication overhead, workload balance and fault tolerance. Programming models that work well on multicore machines (e.g., OpenMP) still do not offer a smooth transition path to the cloud, one that could bridge the gap from a local prototype execution to a cloud production run. On the other hand, cloud-based execution models, like MapReduce, are very effective in performing regular fault-tolerant computation on large distributed workloads. In this paper, we propose OpenMR, an execution model based on OpenMP semantics and MapReduce, which eases the task of programming parallel applications in the cloud. Specifically, this work addresses the problem of performing loop parallelization in a distributed environment, through the mapping of loop iterations to MapReduce nodes. By doing so, the cloud programming interface becomes the programming language itself, freeing the developer from the task of distributing workload and data, while enabling fault tolerance and workload balancing. To assess the validity of the proposal, we modified benchmarks from the SPEC OMP2012 and Rodinia suites to fit the proposed model, developed I/O-bound synthetic benchmarks and validated them using Amazon AWS services. We compare the results to the execution of OpenMP in an SMP architecture, and show that OpenMR exhibits good scalability under a simple programming model.
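The idea of mapping loop iterations to MapReduce nodes can be illustrated with a minimal sketch. This is not the OpenMR implementation, which the abstract does not detail. The function names `split_iterations`, `map_task` and `run_parallel_for` are hypothetical, the chunking policy is an assumption, and the map tasks run sequentially in one process as a stand-in for distributed nodes.

```python
# Illustrative sketch only: a parallel-for expressed as map tasks plus a merge step.
# All names and the contiguous-chunk policy are assumptions, not part of OpenMR.

def split_iterations(start, stop, num_tasks):
    """Partition the iteration space [start, stop) into contiguous chunks."""
    base, extra = divmod(stop - start, num_tasks)
    chunks = []
    lo = start
    for t in range(num_tasks):
        # The first `extra` chunks take one additional iteration.
        size = base + (1 if t < extra else 0)
        if size == 0:
            continue  # more tasks than iterations: skip empty chunks
        chunks.append((lo, lo + size))
        lo += size
    return chunks


def map_task(chunk, body):
    """Run the loop body over one chunk, emitting (index, value) pairs."""
    lo, hi = chunk
    return [(i, body(i)) for i in range(lo, hi)]


def run_parallel_for(start, stop, body, num_tasks=4):
    """Map phase over chunks, then a reduce phase that merges results by index."""
    chunks = split_iterations(start, stop, num_tasks)
    # Sequential stand-in for tasks that would run on separate nodes.
    mapped = [map_task(chunk, body) for chunk in chunks]
    merged = {}
    for part in mapped:
        merged.update(part)
    return [merged[i] for i in range(start, stop)]
```

Because each chunk depends only on its own index range, a failed task could in principle be re-executed independently, which is the property that makes this style of decomposition a natural fit for a fault-tolerant runtime.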
Keywords :
cloud computing; fault tolerant computing; natural sciences computing; parallel programming; programming languages; Amazon AWS services; MapReduce nodes; MapReduce runtime; OpenMP semantics; OpenMR; Rodinia suites; SMP architecture; SPEC OMP2012; cloud production run; cloud programming interface; cloud-based OpenMP parallelization; cloud-based execution models; cloud-based parallel programming models; communication overhead; distributed workloads; fault tolerance; fault-tolerant computation; loop parallelization; multicore machines; programming language; programming parallel applications; scaling features; scientific computing; synthetic benchmarks; workload balance; workload balancing; Computational modeling; Elasticity; Electronics packaging; Fault tolerance; Fault tolerant systems; Programming; Runtime; MapReduce; OpenMP; cloud computing; parallel programming;