DocumentCode :
633745
Title :
Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer
Author :
Wu, Xingfu ; Taylor, Valerie
Author_Institution :
Dept. of Comput. Sci. & Eng., Texas A&M Univ., College Station, TX, USA
fYear :
2013
fDate :
1-3 July 2013
Firstpage :
303
Lastpage :
309
Abstract :
In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks Multi-Zone codes, SP-MZ and BT-MZ; an earthquake simulation, PEQdyna; an aerospace application, PMLB; and a 3D particle-in-cell application, GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting from using different numbers of threads per node. We use the performance tools and the MPI profile and trace libraries available on the supercomputer to analyze and compare the performance of these hybrid scientific applications as the number of OpenMP threads per node increases, and find that increasing the number of threads beyond a certain point saturates or worsens the performance of these hybrid applications. For the strong-scaling hybrid scientific applications such as SP-MZ, BT-MZ, PEQdyna and PMLB, using 32 threads per node results in much better application efficiency than using 64 threads per node; as the number of threads per node increases, the FPU (floating-point unit) percentage decreases, while the MPI percentage (except for PMLB) and the IPC (instructions per cycle) per core (except for BT-MZ) increase. For a weak-scaling hybrid scientific application such as GTC, the performance trend (relative speedup) with an increasing number of threads per node is very similar regardless of how many nodes (32, 128, or 512) are used.
Keywords :
application program interfaces; floating point arithmetic; message passing; multi-threading; multiprocessing systems; natural sciences computing; parallel machines; software libraries; Argonne National Laboratory; FPU; IPC; OpenMP thread; floating point unit; hybrid MPI; hybrid scientific application; large-scale multithreaded BlueGene-Q supercomputer; performance characteristics; trace library; Benchmark testing; Instruction sets; Laboratories; Libraries; Message systems; Performance analysis; Supercomputers; BlueGene/Q; Performance analysis; hybrid MPI/OpenMP; multithreaded;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), 2013 14th ACIS International Conference on
Conference_Location :
Honolulu, HI
Type :
conf
DOI :
10.1109/SNPD.2013.81
Filename :
6598481