• DocumentCode
    3678454
  • Title

    Fault-Tolerant Routing for Exascale Supercomputer: The BXI Routing Architecture

  • Author

    Vignéras;Jean-Noël

  • fYear
    2015
  • Firstpage
    793
  • Lastpage
    800
  • Abstract
    BXI, Bull eXascale Interconnect, is the new interconnection network developed by Atos for High Performance Computing. It has been designed to meet the requirements of exascale supercomputers. At such a scale, faults have to be expected and dealt with transparently so that applications remain unaffected by them. BXI features various mechanisms for this purpose, one of which is the BXI routing component presented in this paper. The BXI routing module computes the full routing tables for a 64k-node fat-tree in a few minutes. With partial recomputation, however, it can withstand numerous inter-router link failures without any noticeable impact on running applications.
  • Keywords
    "Routing","Switches","Topology","System recovery","Ports (Computers)","Algorithm design and analysis","Computer architecture"
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2015 IEEE International Conference on
  • Type

    conf

  • DOI
    10.1109/CLUSTER.2015.135
  • Filename
    7307684