DocumentCode :
1760682
Title :
Control-Flow Decoupling: An Approach for Timely, Non-Speculative Branching
Author :
Sheikh, Rami ; Tuck, James ; Rotenberg, Eric
Author_Institution :
Qualcomm Res., Raleigh, NC, USA
Volume :
64
Issue :
8
fYear :
2015
fDate :
Aug. 1 2015
Firstpage :
2182
Lastpage :
2203
Abstract :
Mobile and PC/server class processor companies continue to roll out flagship core microarchitectures that are faster than their predecessors. Meanwhile, placing more cores on a chip coupled with constant supply voltage puts per-core energy consumption at a premium. Hence, the challenge is to find future microarchitecture optimizations that not only increase performance but also conserve energy. Eliminating branch mispredictions, which waste both time and energy, is valuable in this respect. In this paper, we explore the control-flow landscape by characterizing mispredictions in four benchmark suites. We find that a third of mispredictions-per-1K-instructions (MPKI) come from what we call separable branches: branches with large control-dependent regions (not suitable for if-conversion), whose backward slices do not depend on their control-dependent instructions or have only a short dependence. We propose control-flow decoupling (CFD) to eradicate mispredictions of separable branches. The idea is to separate the loop containing the branch into two loops: the first contains only the branch's predicate computation and the second contains the branch and its control-dependent instructions. The first loop communicates branch outcomes to the second loop through an architectural queue. Microarchitecturally, the queue resides in the fetch unit to drive timely, non-speculative branching. On a microarchitecture configured similarly to Intel's Sandy Bridge core, CFD increases performance by up to 55 percent and reduces energy consumption by up to 49 percent (for CFD regions). Moreover, for some applications, CFD is a necessary catalyst for future complexity-effective large-window architectures to tolerate memory latency.
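The loop-splitting idea in the abstract can be illustrated with a small software analogy. This is only a sketch under stated assumptions, not the paper's mechanism: the paper describes an architectural queue held in the fetch unit, whereas here a Python `collections.deque` stands in for that queue. The function names (`original_loop`, `decoupled_loops`), the threshold predicate, and the squared-sum body are hypothetical and chosen purely for illustration.

```python
from collections import deque


def original_loop(values, threshold):
    """Single loop: predicate computation and control-dependent work are interleaved."""
    total = 0
    for v in values:
        if v > threshold:      # data-dependent branch (hypothetical predicate)
            total += v * v     # stand-in for a large control-dependent region
    return total


def decoupled_loops(values, threshold):
    """Two loops linked by a FIFO of branch outcomes (software stand-in for the queue)."""
    branch_queue = deque()

    # Loop 1: only the predicate computation; outcomes are pushed in program order.
    for v in values:
        branch_queue.append(v > threshold)

    # Loop 2: the branch consumes its precomputed outcome, then runs the region.
    total = 0
    for v in values:
        if branch_queue.popleft():
            total += v * v
    return total


data = [3, 9, 1, 7, 4, 8]
print(original_loop(data, 5), decoupled_loops(data, 5))
```

The split is legal in this sketch because the predicate `v > threshold` never reads `total`, which mirrors the abstract's condition that a separable branch's backward slice does not depend on its control-dependent instructions.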
Keywords :
computer architecture; energy conservation; microprocessor chips; power aware computing; CFD; Intel's Sandy Bridge core; MPKI; PC/server class processor companies; architectural queue; complexity-effective large-window architectures; control-dependent instructions; control-flow decoupling; control-flow landscape; flagship core microarchitectures; memory latency; microarchitecture optimizations; mispredictions-per-1K-instructions; mobile companies; nonspeculative branching; per-core energy consumption; supply voltage; Computational fluid dynamics; Energy consumption; Ground penetrating radar; Hardware; Microarchitecture; Multicore processing; Software; branch prediction; instruction level parallelism; isa extensions; pre-execution; predication; separable branches; software/hardware codesign
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2014.2361526
Filename :
6915862