Title :
RI2N/DRV: Multi-link ethernet for high-bandwidth and fault-tolerant network on PC clusters
Author :
Miura, Shun ; Hanawa, Toshihiro ; Yonemoto, T. ; Boku, Taisuke ; Sato, Mitsuhisa
Author_Institution :
Center for Comput. Sci., Univ. of Tsukuba, Ibaraki, Japan
Abstract :
Although recent high-end interconnection network devices and switches offer a high performance-to-cost ratio, most small to medium-sized PC clusters are still built on the commodity network, Ethernet. To enhance performance on commonly used Gigabit Ethernet networks, link aggregation or binding technology is used. Linux kernels currently include software named Linux Channel Bonding (LCB), which is based on IEEE802.3ad link aggregation technology. However, standard LCB is poorly matched to the TCP protocol; consequently, both large latency and bandwidth instability can occur. LCB also supports a fault-tolerance feature, but its usability is insufficient. We developed a new implementation similar to LCB, named Redundant Interconnection with Inexpensive Network with Driver (RI2N/DRV), for use on Gigabit Ethernet. RI2N/DRV has a complete software stack that is well suited to TCP as the upper-layer protocol. Our algorithm suppresses unnecessary ACK packets and packet retransmissions, even under imbalanced network traffic and link failures on multiple links. It provides both high-bandwidth and fault-tolerant communication on multi-link Gigabit Ethernet. We confirmed that this system improves the performance and reliability of the network and that it can be applied to ordinary UNIX services such as the network file system (NFS) without any modification of other modules.
Keywords :
Linux; bandwidth allocation; fault tolerance; operating system kernels; telecommunication network reliability; telecommunication traffic; transport protocols; workstation clusters; ACK packet; Gigabit Ethernet network; IEEE802.3ad link aggregation technology; Linux Channel Bonding; Linux kernel; PC cluster; RI2N/DRV multilink Ethernet; TCP protocol; UNIX service; fault-tolerant network; high-bandwidth network; high-end interconnection network device; imbalanced network traffic; network link failure; Bonding; Costs; Delay; Ethernet networks; Fault tolerance; Kernel; Linux; Multiprocessor interconnection networks; Protocols; Switches;
Conference_Title :
Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
Conference_Location :
Rome
Print_ISBN :
978-1-4244-3751-1
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2009.5160894