Title :
Hierarchical phasers for scalable synchronization and reductions in dynamic parallelism
Author :
Shirako, Jun ; Sarkar, Vivek
Author_Institution :
Dept. of Comput. Sci., Rice Univ., Houston, TX, USA
Abstract :
The phaser construct is a unification of collective and point-to-point synchronization with dynamic parallelism. This construct gives each task the option of synchronizing on a phaser in signal-only/wait-only mode for producer/consumer synchronization or signal-wait mode for barrier synchronization. A phaser accumulator is a reduction construct that works with phasers in a phased setting. Phasers and accumulators support dynamic parallelism, i.e., they allow dynamic addition and removal of tasks from the synchronizations and reductions that they support. Past implementations of phasers and phaser accumulators have used a single master task to advance a phaser to the next phase and to perform computations for lazy reductions, while also supporting dynamic parallelism. Though the single-master approach provides an effective solution for modest levels of parallelism, it quickly becomes a scalability bottleneck as the number of threads increases. To address this limitation, we propose an approach based on hierarchical phasers for scalable synchronization and hierarchical accumulators for scalable reduction. Our approach also includes tunable initialization parameters that specify the degree and number of tiers for the phaser hierarchy, thereby allowing different values to be chosen for different platforms. Our performance results show significant scalability benefits from our approach. To the best of our knowledge, this is the first approach to support hierarchical synchronization and reductions in the presence of dynamic parallelism.
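The ideas in the abstract (tiered synchronization, per-group partial reductions, and tasks leaving mid-computation) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard `java.util.concurrent.Phaser` class, which supports tiering through its parent-phaser constructor, and approximates an accumulator with hand-rolled per-leaf `AtomicLong` partial sums. The class name `TieredPhaserSketch`, the method `run`, and its `degree` parameter are hypothetical names chosen for this example, and only two tiers are built.

```java
import java.util.Arrays;
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicLong;

class TieredPhaserSketch {
    // Runs `tasks` threads through `phases` barrier phases. Threads are grouped
    // under leaf phasers of at most `degree` parties, all sharing one root.
    // Each task contributes (id + 1) per phase; the last task drops out after
    // phase 0 to illustrate dynamic removal. Returns the sum task 0 observed
    // in each phase.
    static long[] run(int tasks, int degree, int phases) throws InterruptedException {
        Phaser root = new Phaser();                        // top tier, no direct parties
        int leaves = (tasks + degree - 1) / degree;
        Phaser[] leaf = new Phaser[leaves];
        AtomicLong[][] partial = new AtomicLong[phases][leaves];
        for (int i = 0; i < leaves; i++) {
            int parties = Math.min(degree, tasks - i * degree);
            leaf[i] = new Phaser(root, parties);           // child registers itself with root
            for (int p = 0; p < phases; p++) partial[p][i] = new AtomicLong();
        }

        long[] seen = new long[phases];                    // written by task 0 only
        Thread[] threads = new Thread[tasks];
        for (int t = 0; t < tasks; t++) {
            final int id = t;
            final int li = t / degree;
            final Phaser my = leaf[li];
            threads[t] = new Thread(() -> {
                for (int p = 0; p < phases; p++) {
                    partial[p][li].addAndGet(id + 1);      // reduce into this leaf's partial sum
                    my.arriveAndAwaitAdvance();            // signal-wait: barrier across all tiers
                    if (id == 0) {
                        long sum = 0;
                        for (int i = 0; i < leaves; i++) sum += partial[p][i].get();
                        seen[p] = sum;                     // combine the per-leaf partials
                    }
                    if (p == 0 && tasks > 1 && id == tasks - 1) break;  // leave early
                }
                my.arriveAndDeregister();                  // dynamic removal from the phaser
            });
            threads[t].start();
        }
        for (Thread th : threads) th.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        // 8 tasks in two leaves of 4, three phases.
        System.out.println(Arrays.toString(run(8, 4, 3)));  // prints [36, 28, 28]
    }
}
```

With 8 tasks, the first phase sums 1 through 8; after the last task deregisters, later phases sum 1 through 7. Here `degree` is fixed per call, loosely mirroring the tunable hierarchy parameters the abstract mentions.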
Keywords :
parallel processing; synchronisation; barrier synchronization; dynamic parallelism reductions; hierarchical phaser accumulator; hierarchical synchronization; point-to-point synchronization; producer-consumer synchronization; scalability bottleneck; scalable synchronization; tunable initialisation parameters; phasers; dynamic parallelism; reductions
Conference_Title :
Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4244-6442-5
DOI :
10.1109/IPDPS.2010.5470414