Title :
Predict-More Router: A Low Latency NoC Router with More Route Predictions
Author :
Yuan He ; Sasaki, Hiromu ; Miwa, Shinsuke ; Nakamura, Hajime
Author_Institution :
Grad. Sch. of Eng., Univ. of Tokyo, Tokyo, Japan
Abstract :
Network-on-Chip (NoC) is a critical part of the memory hierarchy of emerging multicores. Lowering its communication latency while preserving its bandwidth is key to achieving high system performance. To date, one of the most effective methods for achieving this goal is the prediction router (PR). PR predicts the output route an incoming packet will take, speculatively allocates resources (virtual channels and the switch crossbar) to the packet, and traverses the packet's flits along the predicted route in a single cycle without waiting for route computation. If the prediction misses, however, the packet is processed in the conventional pipeline (four cycles in our work) and the speculatively allocated router resources are wasted. Prediction accuracy therefore determines the number of successful predictions, the latency reduction, and the bandwidth consumed. We find that predictions hit only around 65% of the time for most applications even under the best algorithm, so in such cases PR can accelerate at most about 65% of the packets while the remaining 35% consume extra router resources and bandwidth. To increase prediction accuracy, we propose a technique that applies multiple prediction algorithms at the same time to one incoming packet, which yields more accurate predictions. With this proposal, we design and implement the predict-more router (PmR). While effectively increasing prediction accuracy, PmR also helps use the remaining bandwidth within the router more productively. When both PmR and PR are evaluated under their best algorithm(s), we find that PmR is over 15% higher in prediction accuracy than PR, which helps PmR outperform PR by 3.5% on average in system speedup. We also find that although PmR creates more contention among predictions, this contention is resolved well and kept within the router, so neither router-internal bandwidth nor link bandwidth is adversely affected.
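The latency argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hit costs exactly one cycle and a miss exactly four (the conventional pipeline depth quoted above), and it counts a multi-predictor guess as a hit when any one predictor names the correct output port. The names `expected_hop_latency`, `last_port`, `most_frequent_port`, and `hit_rate` are hypothetical, and the two predictors are illustrative stand-ins rather than the algorithms evaluated in the paper.

```python
from typing import Callable, List, Sequence

HIT_CYCLES = 1   # assumed: a correctly predicted packet traverses in one cycle
MISS_CYCLES = 4  # assumed: a miss falls back to the four-cycle pipeline

# A predictor maps the history of past output ports to a guessed port.
Predictor = Callable[[List[int]], int]


def expected_hop_latency(hit_rate: float) -> float:
    """Average per-hop router latency (cycles) for a given prediction hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be within [0, 1]")
    return hit_rate * HIT_CYCLES + (1.0 - hit_rate) * MISS_CYCLES


def last_port(history: List[int]) -> int:
    """Hypothetical predictor: repeat the most recently used output port."""
    return history[-1] if history else 0


def most_frequent_port(history: List[int]) -> int:
    """Hypothetical predictor: pick the most used port (lowest port on ties)."""
    if not history:
        return 0
    return max(sorted(set(history)), key=history.count)


def hit_rate(trace: Sequence[int], predictors: Sequence[Predictor]) -> float:
    """Fraction of packets for which at least one predictor guessed the port."""
    hits = 0
    history: List[int] = []
    for actual in trace:
        guesses = {predict(history) for predict in predictors}
        if actual in guesses:
            hits += 1
        history.append(actual)
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    trace = [1, 1, 2, 1, 2, 2]  # made-up sequence of output ports
    single = hit_rate(trace, [last_port])
    combined = hit_rate(trace, [last_port, most_frequent_port])
    print("single predictor hit rate:", single)
    print("combined hit rate:", combined)
    print("expected latency at 65% hits:", expected_hop_latency(0.65))
```

Under these assumptions, a higher hit rate lowers the average per-hop latency linearly. The sketch ignores the contention between simultaneous predictions that the abstract mentions, so it shows only the accuracy side of the trade-off.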
Keywords :
multiprocessing systems; network-on-chip; telecommunication network routing; Network-on-Chip; NoC router; PR; PmR; memory hierarchy; multicores; predict more router; route predictions; single cycle; switch crossbar; virtual channels; Accuracy; Bandwidth; Delays; Educational institutions; Prediction algorithms; Routing; Switches; multicore; network-on-chip; prediction; router; speculation;
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
DOI :
10.1109/IPDPSW.2013.40