Title :
s-Step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid
Author :
Williams, S. ; Lijewski, Mike ; Almgren, Ann ; Van Straalen, Brian ; Carson, Erin ; Knight, Nicholas ; Demmel, J.
Abstract :
Geometric multigrid solvers within adaptive mesh refinement (AMR) applications often reach a point where further coarsening of the grid becomes impractical as individual subdomain sizes approach unity. At this point, the most common solution is to use a bottom solver, such as BiCGStab, to reduce the residual by a fixed factor at the coarsest level. Each iteration of BiCGStab requires multiple global reductions (MPI collectives). As the number of BiCGStab iterations required for convergence grows with problem size, and the time for each collective operation increases with machine scale, bottom solves in large-scale applications can constitute a significant fraction of the overall multigrid solve time. In this paper, we implement, evaluate, and optimize a communication-avoiding s-step formulation of BiCGStab (CABiCGStab for short) as a high-performance, distributed-memory bottom solver for geometric multigrid solvers. This is the first time s-step Krylov subspace methods have been leveraged to improve multigrid bottom solver performance. We use a synthetic benchmark for detailed analysis and integrate the best implementation into BoxLib in order to evaluate the benefit of an s-step Krylov subspace method on the multigrid solves found in the applications LMC and Nyx on up to 32,768 cores on the Cray XE6 at NERSC. Overall, we see bottom solver improvements of up to 4.2x on synthetic problems and up to 2.7x in real applications. This results in as much as a 1.5x improvement in solver performance in real applications.
Keywords :
differential equations; geometry; iterative methods; AMR applications; BoxLib; CABiCGStab; Cray XE6; LMC; MPI collectives; NERSC; Nyx; adaptive mesh refinement applications; bottom solvers; communication-avoiding s-step formulation of BiCGStab; geometric multigrid; large-scale applications; machine scale; multiple global reductions; s-step Krylov subspace methods; synthetic benchmark; Benchmark testing; Convergence; Electric breakdown; Mathematical model; Scalability; Three-dimensional displays; Vectors; BiCGStab; Communication-avoiding; Multigrid
Conference_Title :
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-3799-8
DOI :
10.1109/IPDPS.2014.119