• DocumentCode
    2321480
  • Title
    A Bandwidth-Optimized Multi-core Architecture for Irregular Applications
  • Author
    Secchi, Simone; Tumeo, Antonino; Villa, Oreste
  • Author_Institution
    High Performance Comput., Pacific Northwest Nat. Lab., Richland, WA, USA
  • fYear
    2012
  • fDate
    13-16 May 2012
  • Firstpage
    580
  • Lastpage
    587
  • Abstract
    This paper presents an architecture for high-performance computing systems specifically targeted at irregular applications. We show how a multi-core paradigm can benefit from next-generation memories and networks while still relying on fine-grained multi-threading for latency tolerance. We also show that such an architecture template must employ specific techniques to optimize bandwidth utilization and achieve better scalability, and we propose a mechanism based on the aggregation of remote memory references. We explore the proposed architecture template using a custom simulation infrastructure and validate its performance with three typical irregular applications. Our experimental results show the benefits provided by the multi-core approach, in terms of improved scalability, and by the reference aggregation technique, in terms of contention reduction and bandwidth optimization. For a configuration with 32 nodes, and 8 cores and 2 memory controllers per node, the proposed bandwidth optimization technique with the best parameters achieves 1.20 to 2.15 times higher performance and a reduction in network traffic of up to 34.7% with the considered applications.
  • Keywords
    mainframes; memory architecture; multiprocessing systems; parallel architectures; performance evaluation; bandwidth utilization optimization; bandwidth-optimized multicore architecture; contention reduction; fine-grained multithreading; high performance computing systems; irregular applications; latency tolerance; memory controllers; network traffic performance; network traffic reduction; next generation memories; next generation networks; remote memory references aggregation; scalability; simulation infrastructure; Bandwidth; Hardware; Instruction sets; Memory management; Multicore processing; Pipelines; Irregular applications; bandwidth optimization; multi-core; network aggregation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on
  • Conference_Location
    Ottawa, ON
  • Print_ISBN
    978-1-4673-1395-7
  • Type
    conf
  • DOI
    10.1109/CCGrid.2012.53
  • Filename
    6217469