• DocumentCode
    2787331
  • Title

    DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism

  • Author

    Choi, Byn ; Komuravelli, Rakesh ; Sung, Hyojin ; Smolinski, Robert ; Honarmand, Nima ; Adve, Sarita V. ; Adve, Vikram S. ; Carter, Nicholas P. ; Chou, Ching-Tsun

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
  • fYear
    2011
  • fDate
    10-14 Oct. 2011
  • Firstpage
    155
  • Lastpage
    166
  • Abstract
    For parallelism to become tractable for mass programmers, shared-memory languages and environments must evolve to enforce disciplined practices that ban "wild shared-memory behaviors;" e.g., unstructured parallelism, arbitrary data races, and ubiquitous non-determinism. This software evolution is a rare opportunity for hardware designers to rethink hardware from the ground up to exploit opportunities exposed by such disciplined software models. Such a co-designed effort is more likely to achieve many-core scalability than a software-oblivious hardware evolution. This paper presents DeNovo, a hardware architecture motivated by these observations. We show how a disciplined parallel programming model greatly simplifies cache coherence and consistency, while enabling a more efficient communication and cache architecture. The DeNovo coherence protocol is simple because it eliminates transient states - verification using model checking shows 15X fewer reachable states than a state-of-the-art implementation of the conventional MESI protocol. The DeNovo protocol is also more extensible. Adding two sophisticated optimizations, flexible communication granularity and direct cache-to-cache transfers, did not introduce additional protocol states (unlike MESI). Finally, DeNovo shows better cache hit rates and network traffic, translating to better performance and energy. Overall, a disciplined shared-memory programming model allows DeNovo to seamlessly integrate message passing-like interactions within a global address space for improved design complexity, performance, and efficiency.
  • Keywords
    cache storage; message passing; parallel memories; parallel programming; protocols; shared memory systems; ubiquitous computing; DeNovo coherence protocol; MESI protocol; arbitrary data race; cache architecture; cache hit rate; design complexity; direct cache-to-cache transfer; disciplined parallelism; flexible communication granularity; hardware architecture; hardware designer; many-core scalability; mass programmer; memory hierarchy; network traffic; parallel programming model; shared-memory language; shared-memory programming model; software evolution; software-oblivious hardware evolution; Arrays; Coherence; Hardware; Programming; Protocols; Software; Transient analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on
  • Conference_Location
    Galveston, TX
  • ISSN
    1089-795X
  • Print_ISBN
    978-1-4577-1794-9
  • Type

    conf

  • DOI
    10.1109/PACT.2011.21
  • Filename
    6113797