Regional Information Center for Science and Technology - A Performance Analysis of SIMD Algorithms for Monte Carlo Simulations of Nuclear Reactor Cores

DocumentCode :

3200266

Title :

A Performance Analysis of SIMD Algorithms for Monte Carlo Simulations of Nuclear Reactor Cores

Author :

Ozog, David ; Malony, Allen D. ; Siegel, Andrew R.

Author_Institution :

Dept. of Comput. & Inf. Sci., Univ. of Oregon, Eugene, OR, USA

fYear :

2015

fDate :

25-29 May 2015

Firstpage :

733

Lastpage :

742

Abstract :

A primary characteristic of history-based Monte Carlo neutron transport simulation is the application of MIMD-style parallelism: the path of each neutron particle is largely independent of all other particles, so threads of execution perform independent instructions with respect to other threads. This conflicts with the growing trend of HPC vendors exploiting SIMD hardware, which accomplishes better parallelism and more FLOPS per watt. Event-based neutron transport suits vectorization better than history-based transport, but it is difficult to implement and complicates data management and transfer. However, the Intel Xeon Phi architecture supports the familiar x86 instruction set and memory model, mitigating difficulties in vectorizing neutron transport codes. This paper compares the event-based and history-based approaches for exploiting SIMD in Monte Carlo neutron transport simulations. For both algorithms, we analyze performance using the three different execution models provided by the Xeon Phi (offload, native, and symmetric) within the full-featured OpenMC framework. A representative micro-benchmark of the performance bottleneck computation shows about 10x performance improvement using the event-based method. In an optimized history-based simulation of a full-physics nuclear reactor core in OpenMC, the MIC shows a calculation rate 1.6x higher than a modern 16-core CPU, 2.5x higher when balancing load between the CPU and 1 MIC, and 4x higher when balancing load between the CPU and 2 MICs. As far as we are aware, our calculation rate per node on a high-fidelity benchmark (17,098 particles/second) is higher than any other Monte Carlo neutron transport application. Furthermore, we attain 95% distributed efficiency when using MPI and up to 512 concurrent MIC devices.
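The contrast between the two approaches named in the abstract can be sketched with a toy model. This is a minimal illustration under stated assumptions, not code from the paper or from OpenMC: the absorption probability, the function names, and the one-random-stream-per-particle scheme are all hypothetical. The history-based loop follows one particle to absorption before starting the next, while the event-based loop applies the same operation to a whole bank of live particles at each step, which is the loop shape that lends itself to SIMD execution.

```python
import random

# Hypothetical toy physics: every collision absorbs the particle with this probability.
ABSORB_PROB = 0.3


def make_streams(n, seed=0):
    """One independent random stream per particle, so both schemes see the same draws."""
    return [random.Random(seed + i) for i in range(n)]


def history_based(n, seed=0):
    """Follow each particle from birth to absorption before starting the next one."""
    collisions = [0] * n
    for i, rng in enumerate(make_streams(n, seed)):
        while True:
            collisions[i] += 1
            if rng.random() < ABSORB_PROB:
                break  # particle absorbed, its history ends
    return collisions


def event_based(n, seed=0):
    """Advance the whole bank of live particles through one collision event at a time."""
    streams = make_streams(n, seed)
    collisions = [0] * n
    bank = list(range(n))  # indices of particles that are still alive
    while bank:
        # The same operation is applied to every banked particle in this pass.
        draws = [streams[i].random() for i in bank]
        for i in bank:
            collisions[i] += 1
        # Survivors are re-banked for the next event pass.
        bank = [i for i, u in zip(bank, draws) if u >= ABSORB_PROB]
    return collisions
```

Because each particle draws from its own stream, both functions return identical per-particle collision counts; only the loop order differs. The re-banking step in `event_based` is a small-scale picture of the data management overhead the abstract attributes to the event-based method.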

Keywords :

Monte Carlo methods; concurrency control; fission reactors; fusion reactors; instruction sets; neutron transport theory; nuclear engineering computing; parallel processing; software performance evaluation; x86 instruction set; CPU; HPC vendors; Intel Xeon Phi architecture; MIMD-style parallelism; MPI; MICs; Monte Carlo neutron transport simulations; Monte Carlo simulations; OpenMC framework; SIMD algorithms; concurrent MIC devices; data management; event-based approach; event-based method; event-based neutron transport suits vectorization; full-physics nuclear reactor core; history-based Monte Carlo neutron transport simulation; history-based approach; memory model; neutron particle; nuclear reactor cores; optimized history-based simulation; performance analysis; performance bottleneck computation; representative microbenchmark; vectorizing neutron transport codes; Banking; Benchmark testing; Computational modeling; Inductors; Microwave integrated circuits; Monte Carlo methods; Neutrons; Intel Xeon Phi coprocessor; MIC; Monte Carlo; SIMD; neutron transport; performance; reactor simulation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International

Conference_Location :

Hyderabad

ISSN :

1530-2075

Type :

conf

DOI :

10.1109/IPDPS.2015.105

Filename :

7161560

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3200266