DocumentCode
3331040
Title
Highly scalable barriers for future high-performance computing clusters
Author
Fröning, Holger ; Giese, Alexander ; Montaner, Héctor ; Silla, Federico ; Duato, José
Author_Institution
Univ. of Heidelberg, Heidelberg, Germany
fYear
2011
fDate
18-21 Dec. 2011
Firstpage
1
Lastpage
10
Abstract
Although large-scale high-performance computing today typically relies on message passing, shared memory can offer significant advantages because the overhead associated with MPI is avoided entirely. To this end, we have developed an FPGA-based Shared Memory Engine that forwards memory transactions, such as loads and stores, to remote memory locations in large clusters, thus providing a single memory address space. As coherency protocols do not scale with system size, we avoid global coherency across the cluster altogether. However, we maintain local coherency domains, which keeps the cores within one node coherent. In this paper, we show the suitability of our approach by analyzing the performance of barriers, a very common synchronization primitive in parallel programs. Experiments on a real cluster prototype show that our approach allows synchronization among 1024 cores spread over 64 nodes in less than 15 µs, several times faster than other highly optimized barriers. We show the feasibility of this approach by executing a shared-memory implementation of FFT. Finally, this barrier can also be leveraged by MPI applications running on our shared memory architecture for clusters, which makes this work useful for existing applications as well.
Keywords
application program interfaces; distributed shared memory systems; fast Fourier transforms; field programmable gate arrays; memory architecture; message passing; parallel programming; workstation clusters; FFT; FPGA-based shared memory engine; MPI; barrier performance analysis; coherency protocols; future high-performance computing clusters; global coherency; highly scalable barriers; large scale high performance computing; local coherency; memory transactions; parallel programs; shared memory architecture; single memory address space; synchronization; Computer architecture; Engines; Message passing; Message systems; Protocols; Random access memory; Synchronization; computer communications; computer synchronization; distributed shared memory; high performance networking
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing (HiPC), 2011 18th International Conference on
Conference_Location
Bangalore
Print_ISBN
978-1-4577-1951-6
Electronic_ISBN
978-1-4577-1949-3
Type
conf
DOI
10.1109/HiPC.2011.6152729
Filename
6152729