DocumentCode :
2422865
Title :
Network Endpoints for Clusters of SMPs
Author :
Tanase, Gabriel ; Almasi, Gheorghe ; Xue, Hanhong ; Archer, Charles
fYear :
2012
fDate :
24-26 Oct. 2012
Firstpage :
27
Lastpage :
34
Abstract :
Modern large-scale parallel machines feature an increasingly deep hierarchy of interconnections. Individual processing cores employ simultaneous multithreading (SMT) to better exploit functional units, multiple coherent processors are collocated in a node to better exploit links to cache, memory and network (SMP), and multiple nodes are interconnected by specialized low-latency, high-speed networks. Current trends indicate ever wider SMP nodes in the future. To service these nodes, modern high-performance network devices (including InfiniBand and all of IBM's recent offerings) offer the ability to subdivide the network devices' resources among the processing threads. System software, however, lags in exploiting these capabilities, leaving users of, e.g., MPI [14] and UPC [19] in a bind, requiring complex and fragile workarounds in user programs. In this paper we discuss our implementation of endpoints, the software paradigm central to the IBM PAMI messaging library [3]. A PAMI endpoint is an expression in software of a slice of the network device. System software can service endpoints without serializing the many threads on an SMP, that is, without forcing them through a critical section. In the paper we describe the basic guarantees offered by PAMI to the programmer, and how these can be used to enable efficient implementations of high-level libraries and programming languages like UPC. We evaluate the efficiency of our implementation on a novel P7IH system with up to 4096 cores, running microbenchmarks designed to find performance deficiencies in the endpoints implementation of both point-to-point and collective functions.
Keywords :
cache storage; libraries; multi-threading; parallel machines; shared memory systems; IBM PAMI messaging library; P7IH system; SMP cluster; SMT; UPC; cache; collective functions; high level libraries; high performance network devices; low latency-high speed networks; multiple coherent processors; network endpoints; parallel machines; point-to-point functions; programming languages; shared memory nodes; simultaneous multithreading; Context; Geometry; Hardware; Instruction sets; Libraries; Message systems; MPI; endpoint; model; network; programming;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on
Conference_Location :
New York, NY
ISSN :
1550-6533
Print_ISBN :
978-1-4673-4790-7
Type :
conf
DOI :
10.1109/SBAC-PAD.2012.15
Filename :
6374768