• DocumentCode
    244418
  • Title
    Extreme-Scale Viability of Collective Communication for Resilient Task Scheduling and Work Stealing
  • Author
    Wilke, Joachim ; Bennett, Jonathan ; Kolla, Hemanth ; Teranishi, K. ; Slattengren, Nicole ; Floren, John
  • Author_Institution
    Sandia Nat. Labs., Scalable Modeling & Anal., Livermore, CA, USA
  • fYear
    2014
  • fDate
    23-26 June 2014
  • Firstpage
    756
  • Lastpage
    761
  • Abstract
    Extreme-scale computing will bring significant changes to high-performance computing system architectures. In particular, the increased number of system components is creating a need for software to demonstrate "pervasive parallelism" and resiliency. Asynchronous, many-task programming models show promise in addressing both the scalability and resiliency challenges; however, they introduce an enormously challenging distributed, resilient consistency problem. In this work, we explore the viability of resilient collective communication in task scheduling and work stealing and, through simulation with SST/macro, evaluate the performance of these collectives on speculative extreme-scale architectures.
  • Keywords
    object-oriented programming; parallel programming; scheduling; software architecture; software prototyping; SST/macro; distributed resilient consistency problem; extreme-scale architectures; extreme-scale computing; extreme-scale viability; high performance computing system architectures; many-task programming models; pervasive parallelism; resilient collective communication viability; resilient task scheduling; software resiliency; system components; work stealing; Analytical models; Bandwidth; Parallel processing; Resilience; Scalability; Three-dimensional displays; Topology; asynchronous programming models; fault tolerant collectives; structural simulation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Dependable Systems and Networks (DSN), 2014 44th Annual IEEE/IFIP International Conference on
  • Conference_Location
    Atlanta, GA
  • Type
    conf
  • DOI
    10.1109/DSN.2014.105
  • Filename
    6903637