DocumentCode
2194388
Title
A hardware mechanism for dynamic extraction and relayout of program hot spots
Author
Merten, Matthew C. ; Trick, Andrew R. ; Nystrom, Erik M. ; Barnes, Ronald D. ; Hwu, Wen-Mei W.
Author_Institution
Coordinated Sci. Lab., Urbana, IL, USA
fYear
2000
fDate
14-14 June 2000
Firstpage
59
Lastpage
70
Abstract
This paper presents a new mechanism for collecting and deploying runtime optimized code. The code-collecting component resides in the instruction retirement stage and lays out hot execution paths to improve instruction fetch rate as well as enable further code optimization. The code deployment component uses an extension to the Branch Target Buffer to migrate execution into the new code without modifying the original code. No significant delay is added to the total execution of the program due to these components. The code collection scheme enables safe runtime optimization along paths that span function boundaries. This technique provides a better platform for runtime optimization than trace caches, because the traces are longer and persist in main memory across context switches. Additionally, these traces are not as susceptible to transient behavior because they are restricted to frequently executed code. Empirical results show that on average this mechanism can achieve better instruction fetch rates using only 12 KB of hardware than a trace cache requiring 15 KB of hardware, while producing long, persistent traces more suited to optimization.
Keywords
optimising compilers; parallel architectures; code optimization; dynamic extraction; hardware mechanism; instruction fetch rate; program hot spots; relayout; runtime optimized code
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Architecture, 2000. Proceedings of the 27th International Symposium on
Conference_Location
Vancouver, BC, Canada
ISSN
1063-6897
Print_ISBN
1-58113-232-8
Type
conf
Filename
854378