DocumentCode :
598619
Title :
Optimization of geometric multigrid for emerging multi- and manycore processors
Author :
Williams, S. ; Kalamkar, Dhiraj D. ; Singh, Ashutosh ; Deshpande, Aditya M. ; Van Straalen, Brian ; Smelyanskiy, Mikhail ; Almgren, Ann ; Dubey, Pradeep ; Shalf, J. ; Oliker, Leonid
fYear :
2012
fDate :
10-16 Nov. 2012
Firstpage :
1
Lastpage :
11
Abstract :
Multigrid methods are widely used to accelerate the convergence of iterative solvers for linear systems used in a number of different application areas. In this paper, we explore optimization techniques for geometric multigrid on existing and emerging multicore systems including the Opteron-based Cray XE6, Intel® Xeon® E5-2670 and X5550 processor-based Infiniband clusters, as well as the new Intel® Xeon Phi coprocessor (Knights Corner). Our work examines a variety of novel techniques including communication-aggregation, threaded wavefront-based DRAM communication-avoiding, dynamic threading decisions, SIMDization, and fusion of operators. We quantify performance through each phase of the V-cycle for both single-node and distributed-memory experiments and provide detailed analysis for each class of optimization. Results show our optimizations yield significant speedups across a variety of subdomain sizes while simultaneously demonstrating the potential of multi- and manycore processors to dramatically accelerate single-node performance. However, our analysis also indicates that improvements in networks and communication will be essential to reap the potential of manycore processors in large-scale multigrid calculations.
Keywords :
DRAM chips; iterative methods; microprocessor chips; multiprocessing systems; optimisation; Opteron-based Cray XE6; Phi coprocessor; SIMDization; X5550 processor-based Infiniband cluster; Xeon E5-2670; communication-aggregation; dynamic threading; geometric multigrid optimization; iterative solver; large-scale multigrid calculation; linear system; manycore processor; multicore processor; threaded wavefront-based DRAM; Bandwidth; Computer architecture; Instruction sets; Optimization; Parallel processing; Random access memory; Geometric Multigrid; Knights Corner; OpenMP; Xeon Phi; auto-tuning; communication-avoiding; multicore;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
Conference_Location :
Salt Lake City, UT
ISSN :
2167-4329
Print_ISBN :
978-1-4673-0805-2
Type :
conf
DOI :
10.1109/SC.2012.85
Filename :
6468534