Title :
QMP-MVIA: a message passing system for Linux clusters with gigabit Ethernet mesh connections
Author :
Chen, Jie ; Watson, William ; Edwards, Robert ; Mao, Weizhen
Author_Institution :
Jefferson Lab, Coll. of William & Mary, Williamsburg, VA, USA
Abstract :
Recent progress in performance coupled with a decline in price for copper-based gigabit Ethernet (GigE) interconnects makes them an attractive alternative to expensive high speed network interconnects (NIC) when constructing Linux clusters. However, traditional message passing systems based on TCP for GigE interconnects cannot fully utilize the raw performance of today's GigE interconnects due to the overhead of kernel involvement and multiple memory copies during the sending and receiving of messages. The overhead is more evident in the case of mesh connected Linux clusters using multiple GigE interconnects in a single host. We present a general message passing system called QMP-MVIA (QCD Message Passing over M-VIA) for Linux clusters with mesh connections using GigE interconnects. In particular, we evaluate and compare the performance characteristics of TCP and M-VIA (an implementation of the VIA specification) software for a mesh communication architecture to demonstrate the feasibility of using M-VIA as the point-to-point communication software on which QMP-MVIA is based. Furthermore, we illustrate the design and implementation of QMP-MVIA for mesh connected Linux clusters with emphasis on both point-to-point and collective communications, and demonstrate that the QMP-MVIA message passing system using GigE interconnects achieves bandwidth and latency that are not only better than those of systems based on TCP but also compare favorably to those of systems using some of the specialized high speed interconnects in a switched architecture, at much lower cost.
Keywords :
LAN interconnection; Linux; computer communications software; message passing; network interfaces; transport protocols; workstation clusters; GigE interconnects; Linux clusters; M-VIA software; NIC; QCD Message Passing over M-VIA; QMP-MVIA; TCP; VIA specification; collective communications; copper-based gigabit Ethernet interconnects; gigabit Ethernet mesh connections; high speed interconnects; high speed network interconnects; kernel involvement; mesh communication architecture; message passing system; point-to-point communication software; switched architecture; Bandwidth; Communication switching; Computer architecture; Delay; Ethernet networks; High-speed networks; Kernel; Linux; Message passing; Software performance;
Conference_Title :
Cluster Computing, 2004 IEEE International Conference on
Print_ISBN :
0-7803-8694-9
DOI :
10.1109/CLUSTR.2004.1392651