DocumentCode
25203
Title
Thread Row Buffers: Improving Memory Performance Isolation and Throughput in Multiprogrammed Environments
Author
Herrero, Elias ; Gonzalez, Jose ; Canal, Ramon ; Tullsen, Dean
Author_Institution
Dept. of Arquitectura Computadors, Univ. Politec. de Catalunya, Barcelona, Spain
Volume
62
Issue
9
fYear
2013
fDate
Sept. 2013
Firstpage
1879
Lastpage
1892
Abstract
The widespread adoption of chip multiprocessors in recent years has increased the number of applications simultaneously accessing DRAM memories. As a result, memory access patterns have changed, significantly reducing row buffer locality and degrading performance and energy efficiency. Furthermore, concurrent execution of applications has also shown the need for performance isolation among threads in the memory controller to enforce quality of service in virtualized environments. Existing DRAM memories, however, enforce a tradeoff between throughput and isolation. To solve these problems, this paper proposes the addition of Thread Row Buffers (TRBs) to DRAM memories. TRBs keep an active row per thread, thereby increasing DRAM efficiency by avoiding alternating accesses to a limited number of rows and allowing the implementation of a memory scheduler that is not bound to the throughput-isolation tradeoff. Thread Row Buffers with Service Partitioning (TRB-SP) increase the row hit rate by 38 percent with respect to FR-FCFS and by 11 percent with respect to Cache DRAM. This, in turn, increases overall performance by 17 and 7 percent, respectively. TRB-SP also reduces the standard deviation of an application's memory access time by 40 percent over FR-FCFS, 31 percent over PAR-BS, and 42 percent over Cache DRAM.
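The effect of keeping one active row per thread can be illustrated with a toy model. The sketch below is not the paper's scheduler or evaluation methodology. It assumes a single bank, in-order service, and no timing, and the function name `row_hit_rate` and the access trace are hypothetical. It only compares the row hit rate of one shared row buffer against one buffer per thread when two threads interleave accesses to different rows.

```python
def row_hit_rate(accesses, per_thread_buffers):
    """Return the fraction of accesses that hit an already-open row.

    accesses: list of (thread_id, row) pairs sent to one bank, in order.
    per_thread_buffers: if True, each thread has its own row buffer
    (the idea described in the abstract); if False, all threads share one.
    """
    open_rows = {}  # buffer key -> row currently held in that buffer
    hits = 0
    for thread_id, row in accesses:
        key = thread_id if per_thread_buffers else "shared"
        if open_rows.get(key) == row:
            hits += 1  # row hit: the requested row is already open
        else:
            open_rows[key] = row  # row miss: open the row in this buffer
    return hits / len(accesses)


# Hypothetical trace: two threads alternate, each staying within its own row.
trace = [(0, 10), (1, 20)] * 4

shared = row_hit_rate(trace, per_thread_buffers=False)
per_thread = row_hit_rate(trace, per_thread_buffers=True)
print(f"shared buffer hit rate:     {shared:.2f}")
print(f"per-thread buffer hit rate: {per_thread:.2f}")
```

In this trace the shared buffer is evicted on every access, while the per-thread buffers miss only on each thread's first access. Real workloads, multiple banks, and request reordering (as in FR-FCFS) would change the numbers, so the reported 38 and 11 percent improvements should not be read off this model.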
Keywords
DRAM chips; cache storage; energy conservation; microprocessor chips; multiprocessing systems; multiprogramming; quality of service; scheduling; DRAM memories; FR-FCFS; TRB; TRB-SP; cache DRAM; chip multiprocessors; energy efficiency; memory access patterns; memory controller; memory performance isolation; memory performance throughput; memory scheduler; multiprogrammed environments; performance degradation; quality of service; row buffer locality reduction; thread row buffer-service partitioning; throughput-isolation tradeoff; virtualized environments; Delay; Memory management; Parallel processing; Prefetching; Random access memory; Throughput; DRAM; Memory controllers; thread row buffers;
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/TC.2012.173
Filename
6243133