DocumentCode :
1170756
Title :
Software Trace Cache
Author :
Ramirez, Alex ; Larriba-Pey, Josep L. ; Valero, Mateo
Author_Institution :
Univ. Politecnica de Catalunya, Spain
Volume :
54
Issue :
1
fYear :
2005
fDate :
1/1/2005 12:00:00 AM
Firstpage :
22
Lastpage :
35
Abstract :
We explore the use of compiler optimizations that optimize the layout of instructions in memory. The goal is to enable the code to make better use of the underlying hardware resources, regardless of the specific details of the processor/architecture, in order to increase fetch performance. The Software Trace Cache (STC) is a code layout algorithm with a broader target than previous layout optimizations. We target not only an improvement in the instruction cache hit rate, but also an increase in the effective fetch width of the fetch engine. The STC algorithm organizes basic blocks into chains, trying to make sequentially executed basic blocks reside in consecutive memory positions, then maps the basic block chains in memory to minimize conflict misses in the important sections of the program. We evaluate and analyze in detail the impact of the STC, and of code layout optimizations in general, on the three main aspects of fetch performance: the instruction cache hit rate, the effective fetch width, and the branch prediction accuracy. Our results show that layout-optimized codes have some special characteristics that make them more amenable to high-performance instruction fetch. They have a very high rate of not-taken branches and execute long chains of sequential instructions; they also make very effective use of instruction cache lines, mapping only useful instructions that will execute close in time, which increases both spatial and temporal locality.
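The chain-building step described in the abstract can be illustrated with a minimal sketch. This is not the STC algorithm itself, whose details are not given in this record; it is a generic greedy chaining pass over profiled control-flow edges, and the function name `build_chains`, the `(src, dst, count)` edge format, and the example profile are all assumptions made for illustration.

```python
def build_chains(edges):
    """Greedily group basic blocks into chains (illustrative, not the paper's STC).

    edges: iterable of (src, dst, count) tuples, a hypothetical profile format
    where count is how often control flowed from block src to block dst.
    Returns a list of chains; each chain is a list of block names meant to be
    placed in consecutive memory positions.
    """
    edges = list(edges)
    succ, pred = {}, {}   # chosen fall-through successor / predecessor per block
    blocks = []           # blocks in first-seen order, for deterministic output
    for src, dst, _ in edges:
        for block in (src, dst):
            if block not in blocks:
                blocks.append(block)

    def chain_head(block):
        # Walk back along chosen links to the first block of the chain.
        while block in pred:
            block = pred[block]
        return block

    # Consider the most frequently executed transitions first.
    for src, dst, _ in sorted(edges, key=lambda e: e[2], reverse=True):
        if src == dst or src in succ or dst in pred:
            continue  # each block gets at most one successor and one predecessor
        if chain_head(src) == dst:
            continue  # linking would close a cycle instead of extending a chain
        succ[src] = dst
        pred[dst] = src

    chains = []
    for block in blocks:
        if block not in pred:          # block starts a chain
            chain = [block]
            while chain[-1] in succ:
                chain.append(succ[chain[-1]])
            chains.append(chain)
    return chains


# Hypothetical profile: the hot path is A -> B -> D, and C is rarely executed.
profile = [("A", "B", 90), ("A", "C", 10), ("B", "D", 90),
           ("C", "D", 10), ("D", "A", 50)]
chains = build_chains(profile)
layout = [block for chain in chains for block in chain]
```

With this layout the hot path falls through from one block to the next, so its branches are mostly not taken, which is the property the abstract attributes to layout-optimized code. Mapping the chains to addresses so as to avoid cache conflicts is a separate step that this sketch does not attempt.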
Keywords :
cache storage; instruction sets; memory architecture; optimising compilers; parallel architectures; pipeline processing; software performance evaluation; tree data structures; Software Trace Cache code layout algorithm; block chains; branch prediction accuracy; code layout optimizations; compiler optimizations; conflict misses; fetch width; hardware resources; high-performance instruction fetch; instruction cache hit rate; instruction cache lines; memory instruction layout; not-taken branches; pipeline processors; sequential instructions; spatial locality; temporal locality; Accuracy; Computer architecture; Delay; Engines; Hardware; Optimizing compilers; Performance analysis; Pipelines; Process design; Random access memory; Index Terms: Pipeline processors; branch prediction; compiler optimizations; instruction fetch; trace cache
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2005.13
Filename :
1362637