Title :
Resource conscious prefetching for irregular applications in multicores
Author :
Khan, Mahrukh ; Hagersten, Erik
Author_Institution :
Dept. of Inf. Technol., Uppsala Univ., Uppsala, Sweden
Abstract :
Many real-world applications exhibit irregular memory access patterns that cannot be handled by the stream prefetchers in commodity processors. While it is possible to target irregular accesses by prefetching them in software, doing so requires a low-overhead method that prefetches useful data without wasting last-level cache space or off-chip bandwidth. Further, to be practical, such approaches should ideally not require access to source code. In this work we present a low-overhead, software-only method for efficient prefetching of irregular memory access patterns. The method targets commodity multicores and is designed to conserve shared last-level cache space and off-chip bandwidth. Our approach uses low-overhead runtime sampling and statistical cache modeling to identify irregular loads that frequently miss in the cache. A cost-benefit analysis then identifies the irregular loads that can benefit from prefetching in software. This approach improves average single-thread performance across 10 workloads by 10%, without dramatically increasing off-chip bandwidth usage. We evaluate our method on two commodity multicores. Across 210 multi-process runs, each of which utilizes a multicore by running several different workloads in parallel, the proposed irregular software prefetching mechanism achieves up to 22% better throughput than hardware prefetching. All workload mixes benefit from our scheme, with throughput improving by 9% on average.
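As an illustration of what prefetching an irregular load in software can look like, the following is a minimal sketch in C, not the authors' implementation. It assumes an indirect access of the form data[idx[i]], a compiler that provides the GCC/Clang builtin __builtin_prefetch, and a hypothetical fixed prefetch distance (PREFETCH_DISTANCE). The abstract does not state how the method chooses prefetch distances or cache hints; the non-temporal locality hint used here is only one plausible way to limit cache pollution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many iterations ahead to prefetch.
 * The value 16 is an assumption for illustration only. */
#define PREFETCH_DISTANCE 16

/* Baseline: data[idx[i]] is an irregular (indirect) load that a
 * hardware stream prefetcher typically cannot predict. */
double sum_indirect(const double *data, const size_t *idx, size_t n)
{
    assert(data != NULL && idx != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += data[idx[i]];
    return sum;
}

/* Same loop with a software prefetch for the element that will be
 * needed PREFETCH_DISTANCE iterations from now. */
double sum_indirect_prefetch(const double *data, const size_t *idx, size_t n)
{
    assert(data != NULL && idx != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = prefetch for read,
             * 0 = no temporal locality (a hint that the line need not
             * be kept in cache after use). */
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 0);
        }
        sum += data[idx[i]];
    }
    return sum;
}
```

Both functions return the same result; the prefetch only changes when data is brought into the cache. Whether it helps depends on the miss rate of the load and on the added instruction overhead, which is the kind of trade-off the cost-benefit analysis described in the abstract is meant to decide.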
Keywords :
cache storage; multiprocessing systems; storage management; commodity processors; cost-benefit analysis; hardware prefetching; last level cache space; last-level cache; low-overhead method; memory access pattern; multicores; off-chip bandwidth; resource conscious prefetching; software prefetching mechanism; stream prefetchers; Bandwidth; Computational modeling; Load modeling; Multicore processing; Prefetching; Runtime;
Conference_Title :
Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XIV), 2014 International Conference on
Conference_Location :
Agios Konstantinos
DOI :
10.1109/SAMOS.2014.6893192