• DocumentCode
    772128
  • Title
    FC3D: flow control-based distributed deadlock detection mechanism for true fully adaptive routing in wormhole networks
  • Author
    Rubio, Juan-Miguel Martínez ; López, Pedro ; Duato, José
  • Author_Institution
    Dept. of Comput. Eng., Univ. Politecnica de Valencia, Spain
  • Volume
    14
  • Issue
    8
  • fYear
    2003
  • Firstpage
    765
  • Lastpage
    779
  • Abstract
    Two general approaches have been proposed for deadlock handling in wormhole networks. Traditionally, deadlock-avoidance strategies have been used. In this case, either routing is restricted so that there are no cyclic dependencies between channels or cyclic dependencies between channels are allowed provided that there are some escape paths to avoid deadlock. More recently, deadlock recovery strategies have begun to gain acceptance. These strategies allow the use of unrestricted fully adaptive routing, usually outperforming deadlock avoidance techniques. However, they require a deadlock detection mechanism and a deadlock recovery mechanism that is able to recover from deadlocks faster than they occur. In particular, progressive deadlock recovery techniques are very attractive because they allocate a few dedicated resources to quickly deliver deadlocked messages, instead of killing them. Unfortunately, distributed deadlock detection is usually based on crude time-outs, which detect many false deadlocks. As a consequence, messages detected as deadlocked may saturate the bandwidth offered by recovery resources, thus degrading performance. Additionally, the threshold required by the detection mechanism (the time-out) strongly depends on network load, which is not known in advance at the design stage. This limits the applicability of deadlock recovery on actual networks. We propose a novel distributed deadlock detection mechanism that uses only local information, detects all the deadlocks, considerably reduces the probability of false deadlock detection over previously proposed techniques, and is not significantly affected by variations in message length and/or message destination distribution.
  • Keywords
    bandwidth allocation; message passing; multiprocessor interconnection networks; packet switching; resource allocation; system recovery; telecommunication congestion control; telecommunication network routing; FC3D mechanism; crude time-out; deadlock detection mechanism; deadlock-avoidance strategy; deadlocked message; false deadlock detection probability; flow control-based distributed deadlock detection; message destination distribution; message length distribution; network load; progressive deadlock recovery technique; wormhole network adaptive routing; wormhole switching; Adaptive control; Bandwidth; Computer Society; Distributed control; Intelligent networks; Multiprocessor interconnection networks; Programmable control; Resource management; Routing; System recovery;
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type
    jour
  • DOI
    10.1109/TPDS.2003.1225056
  • Filename
    1225056