DocumentCode :
3664144
Title :
ProSteal: A Proactive Work Stealer for Bulk Synchronous Tasks Distributed on a Cluster of Heterogeneous Machines with Multiple Accelerators
Author :
Tarun Beri;Sorav Bansal;Subodh Kumar
Author_Institution :
Indian Inst. of Technol. Delhi, New Delhi, India
fYear :
2015
fDate :
5/1/2015 12:00:00 AM
Firstpage :
17
Lastpage :
26
Abstract :
Work stealing is an effective load balancing technique in shared memory parallel programming. However, in a distributed setup, researchers have pointed out difficulties in termination detection and in sustaining a healthy steal success rate. Keeping unsuccessful steal attempts to a minimum is especially important with many-core accelerators (which have specialized engines for data copy-in and copy-out), as this ensures not only that the accelerators (or GPUs) stay busy but also that their copy engines work in parallel. A steal attempt by a GPU may dry up one or more stages in this pipeline of copy and execution engines. In a cluster environment, a similar problem arises with the pipeline that overlaps remote data transfers with local computations. In this paper, we study the loss in compute-communication overlap as a result of work stealing. We also present a proactive stealing approach that recovers the lost overlap by regaining it at the stealer's end. We evaluate our technique over Unicorn, a framework that decomposes bulk synchronous computations over a cluster of nodes equipped with multiple CPUs and GPUs. Compared to conventional random victim selection with a half-steal strategy, our approach achieves a performance gain of 3.19x when convolving a 4 GB image with a 31×31 filter and 1.34x when multiplying two square matrices of one billion elements each over a 10-node cluster with 120 CPUs and 20 GPUs.
Keywords :
"Pipelines","Graphics processing units","Data transfer","Performance evaluation","Convolution","Load management","Matrix decomposition"
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International
Type :
conf
DOI :
10.1109/IPDPSW.2015.7
Filename :
7284286