Title :
The Significance of CMP Cache Sharing on Contemporary Multithreaded Applications
Author :
Zhang, Eddy Zheng ; Jiang, Yunlian ; Shen, Xipeng
Author_Institution :
Dept. of Comput. Sci., Coll. of William & Mary, Williamsburg, VA, USA
Abstract :
Cache sharing on modern Chip Multiprocessors (CMPs) reduces communication latency among co-running threads, but it also causes inter-thread cache contention. Most previous studies on the influence of cache sharing have concentrated on the design or management of shared caches. The observed influence is often constrained by the reliance on simulators, the use of out-of-date benchmarks, or the limited coverage of deciding factors. This paper describes a systematic measurement of the influence that covers most of the potentially important factors. The measurement shows some surprising results. Contrary to the commonly perceived importance of cache sharing, neither positive nor negative effects of cache sharing are significant for most of the program executions in the PARSEC benchmark suite, regardless of the types of parallelism, input data sets, architectures, numbers of threads, and assignments of threads to cores. After a detailed analysis, we find that the main reason is the mismatch between the software design (and compilation) of multithreaded applications and CMP architectures. By performing source code transformations on the programs in a cache-sharing-aware manner, we observe a performance increase of up to 53 percent when the threads are placed on cores appropriately. This confirms the software-hardware mismatch as a main reason for the observed insignificance of the influence of cache sharing, and indicates the important role of cache-sharing-aware transformations (a topic only sporadically studied so far) in exerting the power of shared cache.
Keywords :
cache storage; multi-threading; multiprocessing systems; CMP cache sharing; chip multiprocessor; interthread cache contention; multithreaded application; parallelism; program execution; shared cache design; shared cache management; software design; software-hardware mismatch; source code transformation; Arrays; Benchmark testing; Instruction sets; Libraries; Message systems; Systematics; Shared cache; chip multiprocessors; parallel program optimizations; thread scheduling
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
DOI :
10.1109/TPDS.2011.130