• DocumentCode
    725416
  • Title

    Investigating the Resilience of Dynamic Loop Scheduling in Heterogeneous Computing Systems

  • Author

    Sukhija, Nitin ; Banicescu, Ioana ; Ciorba, Florina M.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Mississippi State Univ., Starkville, MS, USA
  • fYear
    2015
  • fDate
    June 29 2015-July 2 2015
  • Firstpage
    194
  • Lastpage
    203
  • Abstract
    To improve the performance of complex scientific applications, dynamic loop scheduling (DLS) techniques are often employed for load balancing. However, it is a challenge to select the most resilient scheduling technique for guaranteeing optimized performance of scientific applications on large-scale computing systems. Such systems comprise widely distributed and highly heterogeneous resources, and are often prone to failures. Hence, in this work we perform a comprehensive study of the resilience of DLS techniques. In our study, we employed SimGrid-based simulations. The use of a simulation framework helps overcome the limits of quantifying the resilience and evaluating the performance of DLS techniques on real test beds: it allows us to model, control, and reproduce large-scale computing systems with irregular behaviour in order to analyze the resilience of DLS techniques on computationally intensive scientific applications. The results are used to compare the resilience of scheduling techniques under different scenarios comprising variable problem sizes, system sizes, characteristics of the variations in the application task computation times, and those of the processor availabilities and failures.
  • Keywords
    grid computing; resource allocation; scheduling; software performance evaluation; system recovery; DLS technique; SimGrid-based simulations; application task computation times; complex scientific applications; distributed resources; dynamic loop scheduling technique; heterogeneous computing systems; heterogeneous resources; large-scale computing systems; load balancing; performance improvement; processor availabilities; processor failures; resilient scheduling technique; simulation framework; system sizes; variable problem sizes; variation characteristics; Adaptation models; Analytical models; Computational modeling; Dynamic scheduling; Load modeling; Processor scheduling; Resilience; SimGrid; dynamic loop scheduling; resilience;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Computing (ISPDC), 2015 14th International Symposium on
  • Conference_Location
    Limassol
  • Print_ISBN
    978-1-4673-7147-6
  • Type

    conf

  • DOI
    10.1109/ISPDC.2015.29
  • Filename
    7165146