Title :
Stash directory: A scalable directory for many-core coherence
Author :
Demetriades, Socrates ; Sangyeun Cho
Author_Institution :
Comput. Sci. Dept., Univ. of Pittsburgh, Pittsburgh, PA, USA
Abstract :
Maintaining coherence in large-scale chip multiprocessors (CMPs) embodies tremendous design trade-offs in meeting the area, energy and performance requirements. Sparse directory organizations represent the most energy-efficient and scalable approach towards many-core coherence. However, their limited associativity disallows the one-to-one correspondence of directory entries to cached blocks, rendering them inadequate in tracking all cached blocks. Unless the directory storage is generously over-provisioned, conflicts will force frequent invalidations of cached blocks, severely jeopardizing the system performance. As the chip area and power become increasingly precious with the growing core count, over-provisioning the directory storage becomes unsustainably costly. Stash Directory is a novel sparse directory design that allows directory entries tracking private blocks to be safely evicted without invalidating the corresponding cached blocks. By doing so, it improves system performance and increases the effective directory capacity, enabling significantly smaller directory designs. To ensure correct coherence under the new relaxed inclusion property, stash directory delegates to the last level cache the responsibility to discover hidden cached blocks when necessary, without however raising significant overhead concerns. Simulations on a 16-core CMP model show that Stash Directory can reduce space requirements to 1/8 of a conventional sparse directory, without compromising performance.
Keywords :
cache storage; multiprocessing systems; CMP; area requirements; cached blocks; chip multiprocessors; directory storage; energy requirements; last level cache; many-core coherence; performance requirements; relaxed inclusion property; stash directory; Coherence; Force; Organizations; Pollution; Program processors; Protocols; System performance;
Conference_Titel :
High Performance Computer Architecture (HPCA), 2014 IEEE 20th International Symposium on
Conference_Location :
Orlando, FL
DOI :
10.1109/HPCA.2014.6835928