• DocumentCode
    1996327
  • Title

    Predict-More Router: A Low Latency NoC Router with More Route Predictions

  • Author

    Yuan He ; Sasaki, Hiromu ; Miwa, Shinsuke ; Nakamura, Hajime

  • Author_Institution
    Grad. Sch. of Eng., Univ. of Tokyo, Tokyo, Japan
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    842
  • Lastpage
    850
  • Abstract
    Network-on-Chip (NoC) is a critical part of the memory hierarchy of emerging multicores. Lowering its communication latency while preserving its bandwidth is key to achieving high system performance. To date, one of the most effective methods for achieving this goal is the prediction router (PR). PR predicts the route an incoming packet will take, speculatively allocates resources (virtual channels and the switch crossbar) to the packet, and traverses the packet's flits along the predicted route in a single cycle without waiting for route computation. If the prediction misses, however, the packet is processed in the conventional pipeline (four cycles in our work) and the speculatively allocated router resources are wasted. Prediction accuracy therefore determines the number of successful predictions, the latency reduction, and the bandwidth consumption. We find that predictions hit around 65% of the time for most applications even under the best algorithm, so in such cases PR can accelerate at most about 65% of the packets, while the remaining 35% consume extra router resources and bandwidth. To increase the prediction accuracy, we propose a technique that applies multiple prediction algorithms at the same time to one incoming packet, which makes the prediction more accurate. With this proposal, we design and implement the predict-more router (PmR). While effectively increasing the prediction accuracy, PmR also helps utilize the remaining bandwidth within the router more productively. When both PmR and PR are evaluated under their best algorithm(s), we find that PmR is over 15% higher in prediction accuracy than PR, which helps PmR outperform PR by 3.5% on average in system speedup. We also find that although PmR creates more contention in prediction, this contention can be well resolved and is kept within the router, so it places no additional pressure on either router internal bandwidth or link bandwidth.
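The latency argument in the abstract can be illustrated with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not the authors' evaluation method: it assumes a hit costs one router cycle and a miss costs the four-cycle conventional pipeline with no extra penalty, and it ignores contention. The function name `expected_router_latency` is hypothetical, and the 0.80 hit rate assumes the reported "over 15%" gain means percentage points, which the abstract does not specify.

```python
def expected_router_latency(hit_rate, hit_cycles=1, miss_cycles=4):
    """Average per-router latency in cycles for a given prediction hit rate.

    Assumption: a hit takes `hit_cycles`, a miss takes `miss_cycles`,
    and nothing else (such as contention) adds delay.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


# Roughly 65% hits, the figure the abstract reports for PR.
pr_latency = expected_router_latency(0.65)

# Assumed 80% hits for PmR (65% plus 15 percentage points; an assumption).
pmr_latency = expected_router_latency(0.80)

print(f"PR:  {pr_latency:.2f} cycles per router")
print(f"PmR: {pmr_latency:.2f} cycles per router")
```

Under these assumptions, per-router latency falls as the hit rate rises. The system-level speedup reported in the abstract (3.5% on average) is much smaller than the per-router difference this model yields, since router latency is only one component of overall execution time.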
  • Keywords
    multiprocessing systems; network-on-chip; telecommunication network routing; Network-on-Chip; NoC router; PR; PmR; memory hierarchy; multicores; predict more router; route predictions; single cycle; switch crossbar; virtual channels; Accuracy; Bandwidth; Delays; Educational institutions; Prediction algorithms; Routing; Switches; multicore; network-on-chip; prediction; router; speculation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
  • Conference_Location
    Cambridge, MA
  • Print_ISBN
    978-0-7695-4979-8
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2013.40
  • Filename
    6650963