DocumentCode :
451282
Title :
Scalable NIC-based Reduction on Large-scale Clusters
Author :
Moody, Adam ; Fernandez, Juan ; Petrini, Fabrizio ; Panda, Dhabaleswar K.
Author_Institution :
The Ohio State University, Columbus
fYear :
2003
fDate :
15-21 Nov. 2003
Firstpage :
59
Lastpage :
59
Abstract :
Many parallel algorithms require efficient reduction collectives. In response, researchers have designed algorithms considering a range of parameters including data size, system size, and communication characteristics. Throughout this past work, however, processing was limited to the host CPU. Today, modern Network Interface Cards (NICs) sport programmable processors with substantial memory, and thus introduce a fresh variable into the equation. In this paper, we investigate this new option in the context of large-scale clusters. Through experiments on the 960-node, 1920-processor ASCI Linux Cluster (ALC) at Lawrence Livermore National Laboratory, we show that NIC-based reductions outperform host-based algorithms in terms of reduced latency and increased consistency. In particular, in the largest configuration tested (1812 processors), our NIC-based algorithm summed single-element vectors of 32-bit integers and 64-bit floating-point numbers in 73 µs and 118 µs, respectively. These results represent respective improvements of 121% and 39% over the production-level MPI library.
Keywords :
Algorithm design and analysis; Automatic logic units; Clustering algorithms; Context; Equations; Laboratories; Large-scale systems; Linux; Network interfaces; Parallel algorithms;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Supercomputing, 2003 ACM/IEEE Conference
Print_ISBN :
1-58113-695-1
Type :
conf
DOI :
10.1109/SC.2003.10051
Filename :
1592962