DocumentCode
2293882
Title
Application Performance Tuning for Clusters with ccNUMA Nodes
Author
Kayi, Abdullah; Kornkven, Edward; El-Ghazawi, Tarek; Newby, Greg
Author_Institution
Dept. of Electr. & Comput. Eng., George Washington Univ., Washington, DC
fYear
2008
fDate
16-18 July 2008
Firstpage
245
Lastpage
252
Abstract
With the growing trend of placing more cores on a single chip, more clusters are adopting multicore multiprocessor nodes for high-performance computing (HPC). Cache-coherent non-uniform memory access (ccNUMA) architectures are becoming an increasingly popular choice for such systems. This paper presents an application performance analysis on a 2312-core Opteron system based on Sun Fire servers. Performance bottlenecks are identified and potential solutions are proposed. With the proposed performance tuning, application performance improved by up to 30%. In addition, the experimental analysis can help HPC application developers better understand clusters with ccNUMA nodes and can serve as a guideline for using such architectures in scientific computing.
Keywords
cache storage; memory architecture; multiprocessing systems; 2312 Opteron cores system; application performance tuning; cache coherent nonuniform memory access architecture; ccNUMA nodes; high-performance computing; multicore multiprocessor nodes; Arctic; Benchmark testing; Computer architecture; Fires; High performance computing; Memory architecture; Multicore processing; Scalability; Sockets; Sun; application performance; ccNUMA; cpu affinity; high-performance computing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Science and Engineering, 2008. CSE '08. 11th IEEE International Conference on
Conference_Location
Sao Paulo
Print_ISBN
978-0-7695-3193-9
Type
conf
DOI
10.1109/CSE.2008.46
Filename
4578239