Title :
Exploiting criticality to reduce bottlenecks in distributed uniprocessors
Author :
Robatmili, Behnam ; Govindan, Sibi ; Burger, Doug ; Keckler, Stephen W.
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at Austin, Austin, TX, USA
Abstract :
Composable multicore systems merge multiple independent cores to run sequential single-threaded workloads. The performance scalability of these systems, however, is limited by partitioning overheads. This paper addresses two of the key performance scalability limitations of composable multicore systems. We present a critical path analysis revealing that the communication needed for cross-core register value delivery and fetch stalls due to misspeculation are the two worst bottlenecks preventing efficient scaling to a large number of fused cores. To alleviate these bottlenecks, this paper proposes a fully distributed framework that exploits criticality in these architectures at different granularities. A coordinator core exploits different types of block-level communication criticality information to fine-tune critical instructions at the decode and register-forward pipeline stages of their executing cores. The framework exploits fetch criticality information at a coarser granularity by reissuing all instructions in blocks previously fetched into the merged cores. This general framework reduces competing bottlenecks synergistically and achieves scalable performance/power efficiency for sequential programs running across a large number of cores.
Keywords :
microprocessor chips; multiprocessing systems; performance evaluation; pipeline processing; block level communication criticality information; coarse granularity; composable multicore system; critical path analysis; cross core register value delivery; distributed uniprocessor; fetch criticality information; fetch stalls; fine tune critical instruction; misspeculation; partitioning overhead; performance scalability limitation; sequential single threaded workload; Bandwidth; Benchmark testing; Hardware; Microarchitecture; Multicore processing; Pipelines; Registers;
Conference_Title :
High Performance Computer Architecture (HPCA), 2011 IEEE 17th International Symposium on
Conference_Location :
San Antonio, TX
Print_ISBN :
978-1-4244-9432-3
DOI :
10.1109/HPCA.2011.5749749