Title :
RI2N/UDP: High bandwidth and fault-tolerant network for a PC-cluster based on multi-link Ethernet
Author :
Okamoto, Takayuki ; Miura, Shin Ichi ; Boku, Taisuke ; Sato, Mitsuhisa ; Takahashi, Daisuke
Author_Institution :
Graduate Sch. of Syst. & Inf. Eng., Tsukuba Univ.
Abstract :
PC-clusters, with their high performance/cost ratio, have become a typical platform for high-performance computing. To lower costs, Gigabit Ethernet is often used for the interconnection network. However, the reliability of Ethernet is limited by hardware failures and transient errors in the network switches. To solve this problem, we propose RI2N, an interconnection network system based on multi-link Ethernet. In this paper, we develop a user-level implementation of RI2N over UDP/IP, called RI2N/UDP. In an evaluation of performance and fault tolerance, RI2N/UDP achieved a bandwidth of 246 MB/s on 2-link Gigabit Ethernet and remained active during a network link failure, providing high system reliability.
Keywords :
computer network reliability; fault tolerant computing; transport protocols; workstation clusters; 2-link gigabit Ethernet; PC-cluster; RI2N/UDP; fault-tolerant network; interconnection network system; multilink Ethernet; Bandwidth; Costs; Ethernet networks; Fault tolerance; Fault tolerant systems; Hardware; High performance computing; Multiprocessor interconnection networks; Reliability; Switches;
Conference_Title :
Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International
Conference_Location :
Long Beach, CA
Print_ISBN :
1-4244-0910-1
Electronic_ISBN :
1-4244-0910-1
DOI :
10.1109/IPDPS.2007.370477