• DocumentCode
    2502908
  • Title
    Dynamic Fault Tolerance with Misrouting in Fat Trees
  • Author
    Sem-Jacobsen, Frank Olaf ; Skeie, Tor ; Lysne, Olav ; Duato, José
  • Author_Institution
    Dept. of Informatics, Oslo Univ.
  • fYear
    2006
  • fDate
    14-18 Aug. 2006
  • Firstpage
    33
  • Lastpage
    44
  • Abstract
    Fault tolerance is critical for efficient utilisation of large computer systems. Dynamic fault tolerance allows the network to remain available through the occurrence of faults, as opposed to static fault tolerance, which requires the network to be halted in order to reconfigure it. Although dynamic fault tolerance may lead to less efficient solutions than static fault tolerance, it allows for much higher availability of the system. In this paper we devise a dynamic fault-tolerant adaptive routing algorithm for the fat tree, a widely used interconnect topology, which relies on misrouting around link faults. We show that we are guaranteed to tolerate any combination of fewer than (num_switch_ports)/2 link faults without the need for additional network resources for deadlock freedom. There is also a high probability of tolerating an even larger number of link faults. Simulation results show that network performance degrades very little when faults are dynamically tolerated.
  • Keywords
    fault tolerant computing; multiprocessor interconnection networks; telecommunication network routing; trees (mathematics); dynamic fault tolerance; dynamic fault tolerant adaptive routing; fat tree; interconnect topology; link fault misrouting; network performance; Circuit faults; Computer networks; Distributed computing; Fault tolerance; Fault tolerant systems; Integrated circuit interconnections; Network topology; Routing; Switches; Switching circuits;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 2006. ICPP 2006. International Conference on
  • Conference_Location
    Columbus, OH
  • ISSN
    0190-3918
  • Print_ISBN
    0-7695-2636-5
  • Type
    conf
  • DOI
    10.1109/ICPP.2006.36
  • Filename
    1690603