Title :
DraMon: Predicting memory bandwidth usage of multi-threaded programs with high accuracy and low overhead
Author :
Wang, Wei ; Dey, Tamal ; Davidson, Jack W. ; Soffa, Mary Lou
Author_Institution :
Dept. of Comput. Sci., Univ. of Virginia, Charlottesville, VA, USA
Abstract :
Memory bandwidth severely limits the scalability and performance of today's multi-core systems. Because of this limitation, many studies that focus on improving multi-core scalability rely on bandwidth usage predictions to achieve the best results. However, existing bandwidth prediction models have low accuracy, causing these studies to reach inaccurate conclusions or perform sub-optimally. Most of these models make predictions based on bandwidth usage samples from a few trial runs, and they overlook many factors that affect bandwidth usage as well as the complexity of DRAM operations. This paper presents DraMon, a model that predicts bandwidth usage for multi-threaded programs with low overhead. It achieves high accuracy through highly accurate predictions of DRAM contention and DRAM concurrency, as well as by considering a wide range of hardware and software factors that impact bandwidth usage. We implemented two versions of DraMon: DraMon-T, a memory-trace based model, and DraMon-R, a run-time model that uses hardware performance counters. When evaluated on a real machine with memory-intensive benchmarks, DraMon-T has average accuracies of 99.17% and 94.70% for DRAM contention predictions and bandwidth predictions, respectively. DraMon-R has average accuracies of 98.55% and 93.37% for DRAM contention and bandwidth predictions, respectively, with only 0.50% overhead on average.
Keywords :
DRAM chips; concurrency control; multi-threading; multiprocessing systems; storage management; system monitoring; DRAM bandwidth prediction; DRAM concurrency; DRAM contention prediction; DRAM operations; DraMon-R; DraMon-T; hardware performance counters; memory bandwidth usage prediction; memory-intensive benchmark; memory-trace based model; multicore system performance; multicore system scalability; multithreaded programs; overhead; run-time model; Accuracy; Bandwidth; Concurrent computing; Instruction sets; Mathematical model; Predictive models; Random access memory
Conference_Title :
2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)
Conference_Location :
Orlando, FL
DOI :
10.1109/HPCA.2014.6835948