Regional Information Center for Science and Technology - Tango: a hardware-based data prefetching technique for superscalar processors

DocumentCode :

1572101

Title :

Tango: a hardware-based data prefetching technique for superscalar processors

Author :

Pinter, Shlomit S. ; Yoaz, Adi

Author_Institution :

IBM Sci. & Technol., MATAM Adv. Technol. Center, Haifa, Israel

fYear :

1996

Firstpage :

214

Lastpage :

225

Abstract :

We present a new hardware-based data prefetching mechanism for enhancing instruction-level parallelism and improving the performance of superscalar processors. The emphasis in our scheme is on the effective utilization of slack time and hardware resources not used for the main computation. The scheme introduces a new hardware construct, the program progress graph (PPG), as a simple extension to the branch target buffer (BTB). We use the PPG to implement a fast pre-program counter (pre-PC) that travels only through memory reference instructions (rather than scanning all the instructions sequentially). In a single clock cycle the pre-PC extracts all the predicted memory references in some future block of instructions, to obtain early data prefetching. In addition, the PPG can be used for implementing a pre-processor and for instruction prefetching. The prefetch requests are scheduled to “tango” with the core requests from the data cache, by using only free time slots on the existing data cache tag ports. Special methods for removing prefetch requests that are already in the cache (without using the cache-tag port bandwidth), together with a simple optimization of the cache LRU mechanism, reduce the number of prefetch requests sent to the core-cache bus and to the memory (second-level) bus. Simulation results on the SPEC92 benchmark for the baseline architecture (32 KB data cache and 12-cycle fetch latency) show an average speedup of 1.36 (CPI ratio).

Keywords :

digital simulation; instruction sets; parallel processing; performance evaluation; storage management; LRU mechanism; SPEC92 benchmark; Tango; base line architecture; branch target buffer; hardware resources; hardware-based data prefetching technique; instruction level parallelism; instruction prefetching; memory reference instructions; performance; program progress graph; simulation results; slack time; superscalar processors; Bandwidth; Clocks; Data mining; Delay; Electronic mail; Hardware; Microprocessors; Optimization methods; Parallel processing; Prefetching;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Microarchitecture, 1996. MICRO-29. Proceedings of the 29th Annual IEEE/ACM International Symposium on

Conference_Location :

Paris

Print_ISBN :

0-8186-7641-8

Type :

conf

DOI :

10.1109/MICRO.1996.566463

Filename :

566463

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1572101