• DocumentCode
    3247976
  • Title
    RI2N: High-bandwidth and fault-tolerant network with multi-link Ethernet for PC clusters
  • Author
    Miura, Shinichi ; Okamoto, Takayuki ; Boku, Taisuke ; Hanawa, Toshihiro ; Sato, Mitsuhisa
  • Author_Institution
    Center for Comput. Sci., Univ. of Tsukuba, Tsukuba
  • fYear
    2008
  • fDate
    Sept. 29 2008-Oct. 1 2008
  • Firstpage
    274
  • Lastpage
    279
  • Abstract
    Although recent high-end interconnection network devices and switches offer a high performance/cost ratio, most small to medium-sized PC clusters are still built on a commodity network, Ethernet. To improve performance on commonly used gigabit Ethernet networks, link aggregation (binding) technology is used. The Linux kernel currently includes a software solution named Linux channel bonding (LCB), which is based on IEEE 802.3ad Link Aggregation technology. However, standard LCB is poorly matched to the commonly used TCP protocol, which results in both large latency and unstable bandwidth improvement. Fault tolerance is also supported, but its usability is insufficient. We have developed a new implementation similar to LCB, named RI2N/DRV (redundant interconnection with inexpensive network with driver), for use on gigabit Ethernet, with a complete software stack that is highly compatible with the TCP protocol. Our algorithm suppresses unnecessary ACK packets and packet retransmissions even under imbalanced network traffic and link failures on multiple links. It provides both high-bandwidth and fault-tolerant communication on multi-link gigabit Ethernet. We confirmed that this system improves the performance and reliability of the network, and that it can be applied to ordinary UNIX services such as NFS without modifying other modules.
  • Keywords
    Linux; fault tolerant computing; transport protocols; wireless LAN; workstation clusters; IEEE802.3ad link aggregation technology; Linux channel bonding; Linux kernel; PC clusters; TCP protocol; UNIX; fault-tolerant network; gigabit Ethernet; high-bandwidth network; high-end interconnection network; multilink Ethernet; redundant interconnection; reliability; Bonding; Costs; Delay; Ethernet networks; Fault tolerance; Kernel; Linux; Multiprocessor interconnection networks; Protocols; Switches;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing, 2008 IEEE International Conference on
  • Conference_Location
    Tsukuba
  • ISSN
    1552-5244
  • Print_ISBN
    978-1-4244-2639-3
  • Electronic_ISBN
    1552-5244
  • Type
    conf
  • DOI
    10.1109/CLUSTR.2008.4663781
  • Filename
    4663781