DocumentCode :
1858718
Title :
Understanding Off-Chip Memory Contention of Parallel Programs in Multicore Systems
Author :
Tudor, Bogdan Marius ; Teo, Yong Meng ; See, Simon
Author_Institution :
Dept. of Comput. Sci., Nat. Univ. of Singapore, Singapore, Singapore
fYear :
2011
fDate :
13-16 Sept. 2011
Firstpage :
602
Lastpage :
611
Abstract :
Memory contention is an important performance issue in current multicore architectures. In this paper, we focus on understanding how off-chip memory contention affects the performance of parallel applications. Using measurements conducted on state-of-the-art multicore systems, we observed that off-chip memory traffic is not always bursty, as previously reported in the literature. Burstiness depends on the problem size. Small problem sizes lead to bursty memory traffic and generate little off-chip contention. In contrast, when large problem sizes cause memory contention, the memory traffic is non-bursty. Based on these observations, we propose an analytical model that relates the growth of memory contention to the number of active cores and to the problem size, for both uniform memory access (UMA) and non-uniform memory access (NUMA) systems. On average, our model differs from measurements by less than 14%. Contention for off-chip memory grows exponentially with the number of active cores, but adding memory controllers reduces the memory contention. For programs such as the pentadiagonal solver SP from the NPB benchmark suite, with a large matrix of $162^3$ elements (input size C), our analysis shows that memory contention increases the total number of processor cycles needed to execute the program more than tenfold on a machine with 24 cores.
Keywords :
multiprocessing systems; parallel processing; storage management; multicore system; nonuniform memory access; off-chip memory contention; off-chip memory traffic; parallel program; uniform memory access; Bandwidth; Hardware; Instruction sets; Memory management; Multicore processing; Process control; Random access memory; NUMA; UMA; analytical model; memory contention; multicore systems; parallel performance; performance analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Processing (ICPP), 2011 International Conference on
Conference_Location :
Taipei City
ISSN :
0190-3918
Print_ISBN :
978-1-4577-1336-1
Electronic_ISBN :
0190-3918
Type :
conf
DOI :
10.1109/ICPP.2011.59
Filename :
6047228