DocumentCode
3247976
Title
RI2N: High-bandwidth and fault-tolerant network with multi-link Ethernet for PC clusters
Author
Miura, Shinichi ; Okamoto, Takayuki ; Boku, Taisuke ; Hanawa, Toshihiro ; Sato, Mitsuhisa
Author_Institution
Center for Comput. Sci., Univ. of Tsukuba, Tsukuba
fYear
2008
fDate
Sept. 29 2008-Oct. 1 2008
Firstpage
274
Lastpage
279
Abstract
Although recent high-end interconnection network devices and switches provide a high performance/cost ratio, most small to medium-sized PC clusters are still built on the commodity network, Ethernet. To enhance performance on commonly used gigabit Ethernet networks, link aggregation or binding technology is used. The Linux kernel currently includes a software solution named Linux channel bonding (LCB), which is based on IEEE 802.3ad Link Aggregation technology. However, standard LCB is poorly matched with the commonly used TCP protocol, which results in both large latency and unstable bandwidth improvement. A fault-tolerant feature is also supported, but its usability is insufficient. We have developed a new implementation similar to LCB, named RI2N/DRV (redundant interconnection with inexpensive network with driver), for use on gigabit Ethernet with a complete software stack that is highly compatible with the TCP protocol. Our algorithm suppresses unnecessary ACK packets and packet retransmissions even under imbalanced network traffic and link failures on multiple links. It provides both high-bandwidth and fault-tolerant communication on multi-link gigabit Ethernet. We confirmed that this system improves the performance and reliability of the network and that it can be applied to ordinary UNIX services such as NFS without any modification of other modules.
Keywords
Linux; fault tolerant computing; transport protocols; wireless LAN; workstation clusters; IEEE802.3ad link aggregation technology; Linux channel bonding; Linux kernel; PC clusters; TCP protocol; UNIX; fault-tolerant network; gigabit Ethernet; high-bandwidth network; high-end interconnection network; multilink Ethernet; redundant interconnection; reliability; Bonding; Costs; Delay; Ethernet networks; Fault tolerance; Kernel; Linux; Multiprocessor interconnection networks; Protocols; Switches;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster Computing, 2008 IEEE International Conference on
Conference_Location
Tsukuba
ISSN
1552-5244
Print_ISBN
978-1-4244-2639-3
Electronic_ISBN
1552-5244
Type
conf
DOI
10.1109/CLUSTR.2008.4663781
Filename
4663781