Title :
A Data-Aware Partitioning and Optimization Method for Large-Scale Workflows in Hybrid Computing Environments
Author :
Rubing Duan ; Xiaorong Li
Author_Institution :
Inst. of High Performance Comput., A*STAR, Singapore, Singapore
Abstract :
While hybrid computing environments offer good potential for achieving high performance at low economic cost, they also introduce a broad set of unpredictable overheads, especially when running data-intensive applications. This paper describes a novel approach that refines workflow structures and optimizes intermediate data transfers for large-scale scientific workflows containing thousands (or even millions) of tasks. The proposed method includes pre- and post-partitioning of workflows and data-flow optimization. Firstly, it partitions a workflow by identifying the critical path of the task graph. Secondly, it controls the granularity of partitions to reduce the complexity of the task graph so that large-scale workflows can be processed. Thirdly, it optimizes the data flow based on the schedule to minimize communication overhead. The proposed approach is able to handle complex data flows and significantly reduces data transfer by replacing individual tasks according to data dependencies. We conducted experiments using real applications such as Montage and Broadband, and the results demonstrate the effectiveness of our method in achieving low execution time with low communication overhead in hybrid computing environments.
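The first step named in the abstract, identifying the critical path of the task graph, can be illustrated with a minimal sketch. This is not the authors' algorithm: it is the textbook longest-path computation on a directed acyclic graph, and it assumes each task has a known runtime and that data-transfer costs between tasks are ignored. The function name `critical_path` and the inputs `runtime` and `edges` are hypothetical names chosen for this illustration.

```python
from collections import defaultdict, deque


def critical_path(runtime, edges):
    """Return (length, path) of the longest runtime-weighted path in a DAG.

    runtime: dict mapping task name -> estimated runtime (assumed known).
    edges:   iterable of (producer, consumer) data dependencies.
    Communication costs are deliberately ignored in this sketch.
    """
    succ = defaultdict(list)
    indeg = {task: 0 for task in runtime}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    # Kahn's algorithm: visit tasks in topological order.
    queue = deque(task for task in runtime if indeg[task] == 0)
    finish = {}   # earliest finish time of each task
    pred = {}     # predecessor on the longest path into each task
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        finish.setdefault(u, runtime[u])  # entry tasks start at time 0
        pred.setdefault(u, None)
        for v in succ[u]:
            candidate = finish[u] + runtime[v]
            if candidate > finish.get(v, float("-inf")):
                finish[v] = candidate
                pred[v] = u
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    if visited != len(runtime):
        raise ValueError("task graph contains a cycle")

    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    length = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = pred[node]
    path.reverse()
    return length, path


if __name__ == "__main__":
    runtime = {"a": 2, "b": 3, "c": 1, "d": 4}
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
    print(critical_path(runtime, edges))
```

A partitioner built on this idea might keep the tasks on the returned path together and group the remaining tasks around them, but how the paper actually forms and sizes partitions is not stated in the abstract.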
Keywords :
graph theory; grid computing; optimisation; data-aware partitioning; data-flow optimization; hybrid computing environment; large-scale scientific workflow; optimization method; task graph; Algorithm design and analysis; Complexity theory; Data transfer; Optimization; Parallel processing; Partitioning algorithms; Physics
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2013 International Conference on
Conference_Location :
Seoul
DOI :
10.1109/ICPADS.2013.29