Title :
A performance study of software and hardware data prefetching schemes
Author :
Chen, Tien-Fu ; Baer, Jean-Loup
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Chung Cheng Univ., Chiayi, Taiwan
Abstract :
Prefetching, i.e., exploiting the overlap of processor computations with data accesses, is one of several approaches for tolerating memory latencies. Prefetching can be hardware-based, software-directed, or a combination of both. Hardware-based prefetching, which requires a support unit connected to the cache, can dynamically handle prefetches at run-time without compiler intervention. Software-directed approaches rely on compiler technology to insert explicit prefetch instructions. Mowry et al.'s software scheme (1991, 1992) and the authors' hardware approach (1991) are two representative schemes. In this paper, the authors evaluate approximations to these two schemes in the context of a shared-memory multiprocessor environment. Their qualitative comparisons indicate that both schemes are able to reduce cache misses in the domain of linear array references. When complex data access patterns are considered, the software approach has compile-time information to perform sophisticated prefetching, whereas the hardware scheme has the advantage of manipulating dynamic information. The performance results from an instruction-level simulation of four benchmarks confirm these observations. Simulations show that the hardware scheme introduces more memory traffic into the network and that the software scheme introduces a non-negligible instruction execution overhead. An approach combining software and hardware schemes is proposed; it shows promise in reducing the memory latency with the least overhead.
Keywords :
memory architecture; performance evaluation; shared memory systems; storage management; benchmarks; data prefetching schemes; hardware-based; memory latencies; memory latency; overhead; performance study; shared-memory multiprocessor; software-directed; Computer science; Data engineering; Delay; Hardware; Manipulator dynamics; Prefetching; Runtime; Software performance; Telecommunication traffic; Traffic control;
Conference_Title :
Proceedings of the 21st Annual International Symposium on Computer Architecture, 1994
Conference_Location :
Chicago, IL
Print_ISBN :
0-8186-5510-0
DOI :
10.1109/ISCA.1994.288147