DocumentCode
3678454
Title
Fault-Tolerant Routing for Exascale Supercomputer: The BXI Routing Architecture
Author
Vignéras;Jean-Noël
fYear
2015
Firstpage
793
Lastpage
800
Abstract
BXI (Bull eXascale Interconnect) is the new interconnection network developed by Atos for High Performance Computing. It has been designed to meet the requirements of exascale supercomputers. At such a scale, faults have to be expected and dealt with transparently so that applications remain unaffected by them. BXI features various mechanisms for this purpose, one of which is the BXI routing component presented in this paper. The BXI routing module computes the full routing tables for a 64k-node fat-tree in a few minutes. With partial re-computation, however, it can withstand numerous inter-router link failures without any noticeable impact on running applications.
Keywords
"Routing","Switches","Topology","System recovery","Ports (Computers)","Algorithm design and analysis","Computer architecture"
Publisher
ieee
Conference_Titel
Cluster Computing (CLUSTER), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/CLUSTER.2015.135
Filename
7307684