Title :
Quantum Chemical Many-Body Theory on Heterogeneous Nodes
Author :
DePrince, A. Eugene ; Hammond, Jeff R.
Abstract :
The iterative solution of the coupled-cluster with single and double excitations (CCSD) equations is a very time-consuming component of the "gold standard" in quantum chemistry, the CCSD(T) method. In an effort to accelerate accurate quantum mechanical calculations, we explore two implementation strategies for the iterative solution of the CC equations on graphics processing units (GPUs). We consider a communication-avoiding algorithm for the spin-free coupled-cluster doubles (CCD) equations, followed by a low-storage algorithm for the spin-free CCSD equations. In the communication-avoiding algorithm, the entire iterative procedure for the CCD method is performed on the GPU, resulting in accelerations of a factor of 4 to 5 relative to the pure CPU algorithm. The low-storage CCSD algorithm requires that a minimum of 4o^2v^2 + 2ov elements be stored on the device, where o and v represent the number of orbitals occupied and unoccupied in the reference configuration, respectively. The algorithm masks the transfer time for copying large amounts of data to the GPU by overlapping GPU and CPU computations. The per-iteration costs of this hybrid GPU/CPU algorithm are up to 4.06 times lower than those of the pure CPU algorithm and up to 10.63 times lower than those of the CCSD implementation found in the Molpro electronic structure package. These results provide insight into how to organize communication and computation so as to maximize utilization of a GPU and multicore CPU at the same time.
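To give a sense of scale for the device storage requirement quoted in the abstract, the following minimal sketch evaluates 4o^2v^2 + 2ov for a given number of occupied (o) and unoccupied (v) orbitals. The function names, the example orbital counts, and the 8-byte (double-precision) element size are assumptions made for illustration; they are not taken from the paper.

```python
def min_device_elements(o: int, v: int) -> int:
    """Minimum element count on the device: 4*o^2*v^2 + 2*o*v."""
    return 4 * o**2 * v**2 + 2 * o * v


def min_device_bytes(o: int, v: int, bytes_per_element: int = 8) -> int:
    """Storage in bytes, assuming 8-byte elements (an assumption, not from the paper)."""
    return min_device_elements(o, v) * bytes_per_element


if __name__ == "__main__":
    # Hypothetical system size, chosen only to illustrate the scaling.
    o, v = 20, 200
    elements = min_device_elements(o, v)
    gib = min_device_bytes(o, v) / 2**30
    print(f"o={o}, v={v}: {elements} elements, about {gib:.2f} GiB")
```

The o^2v^2 term dominates, so doubling either o or v roughly quadruples the minimum device footprint.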
Keywords :
computational complexity; computer graphic equipment; iterative methods; multiprocessing systems; quantum computing; Molpro electronic structure package; communication-avoiding algorithm; coupled-cluster single excitations equations; graphics processing units; heterogeneous nodes; iterative procedure; low-storage algorithm; multicore CPU; quantum chemical many-body theory; quantum mechanical calculations; spin-free coupled cluster doubles equations; Buffer storage; Charge coupled devices; Equations; Graphics processing unit; Hardware; Multicore processing; Tensile stress; CUDA; coupled-cluster; multicore; quantum chemistry;
Conference_Title :
Application Accelerators in High-Performance Computing (SAAHPC), 2011 Symposium on
Conference_Location :
Knoxville, TN
Print_ISBN :
978-1-4577-0635-6
Electronic_ISBN :
978-0-7695-4448-9
DOI :
10.1109/SAAHPC.2011.28