DocumentCode
1565772
Title
Effective ahead pipelining of instruction block address generation
Author
Seznec, André ; Fraboulet, Antoine
Author_Institution
IRISA/INRIA, France
fYear
2003
Firstpage
241
Lastpage
252
Abstract
On an N-way issue superscalar processor, the front-end instruction fetch engine must deliver instructions to the execution core at a sustained rate higher than N instructions per cycle. This means that the instruction address generator/predictor (IAG) has to predict the instruction flow at an even higher rate, without sacrificing prediction accuracy. Achieving high accuracy on this prediction becomes more and more critical as the overall pipeline becomes deeper with each new generation of processors. As a result, very complex IAGs are used, featuring different predictors for jumps, returns, conditional and unconditional branches, together with complex logic. Usually, the IAG uses information (branch histories, fetch addresses, ...) available at a given cycle to predict the next fetch address(es). Unfortunately, a complex IAG cannot deliver a prediction within a short cycle. Therefore, processors rely on a hierarchy of IAGs with increasing accuracies but also increasing latencies: the accurate but slow IAG is used to correct the fast but less accurate IAG. A significant part of the potential instruction bandwidth is often wasted in pipeline bubbles due to these corrections. As an alternative to a hierarchy of IAGs, it is possible to initiate the instruction address generation several cycles ahead of its use. We explore such an ahead pipelined IAG in detail. The example illustrated shows that, even when the instruction address generation is (partially) initiated five cycles ahead of its use, it is possible to reach approximately the same prediction accuracy as that of a conventional one-block-ahead complex IAG. The solution presented makes it possible to deliver a sustained address generation rate close to one instruction block per cycle with state-of-the-art accuracy.
Keywords
bandwidth allocation; instruction sets; multiprocessing systems; pipeline processing; program compilers; program control structures; storage allocation; conditional branch predictor; instruction address generator; instruction fetch bandwidth; instruction flow prediction; jump predictor; pipelined IAG; superscalar processor; unconditional branch predictor; Accuracy; Art; Bandwidth; Computer aided instruction; Delay; Engines; Hardware; History; Logic; Pipeline processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Architecture, 2003. Proceedings. 30th Annual International Symposium on
ISSN
1063-6897
Print_ISBN
0-7695-1945-8
Type
conf
DOI
10.1109/ISCA.2003.1207004
Filename
1207004