  • DocumentCode
    1661609
  • Title
    Flexible coscheduling: mitigating load imbalance and improving utilization of heterogeneous resources
  • Author
    Frachtenberg, Eitan ; Feitelson, Dror G. ; Petrini, Fabrizio ; Fernandez, Juan
  • Author_Institution
    Comput. & Computational Sci. Div., Los Alamos Nat. Lab., NM, USA
  • fYear
    2003
  • Abstract
    Fine-grained parallel applications require all their processes to run simultaneously on distinct processors to achieve good efficiency. This is typically accomplished by space slicing, wherein nodes are dedicated for the duration of the run, or by gang scheduling, wherein time slicing is coordinated across processors. Both schemes suffer from fragmentation, where processors are left idle because jobs cannot be packed with perfect efficiency. Obviously, this leads to reduced utilization and sub-optimal performance. Flexible coscheduling (FCS) solves this problem by monitoring each job's granularity and communication activity, and using gang scheduling only for those jobs that require it. Processes from other jobs, which can be scheduled without any constraints, are used as filler to reduce fragmentation. In addition, inefficiencies due to load imbalance and hardware heterogeneity are also reduced because the classification is done on a per-process basis. FCS has been fully implemented as part of the STORM resource manager, and shown to be competitive with gang scheduling and implicit coscheduling.
  • Keywords
    parallel architectures; resource allocation; workstation clusters; STORM resource manager; cluster computing; communication activity; fine-grained parallel applications; flexible coscheduling; gang scheduling; hardware heterogeneity; heterogeneous clusters; heterogeneous resources utilization; job scheduling; load balancing; load imbalance; space slicing; time slicing; Application software; Concurrent computing; Grid computing; Hardware; Informatics; Laboratories; Processor scheduling; Resource management; Storms; Yarn
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Parallel and Distributed Processing Symposium, 2003. Proceedings. International
  • ISSN
    1530-2075
  • Print_ISBN
    0-7695-1926-1
  • Type
    conf
  • DOI
    10.1109/IPDPS.2003.1213191
  • Filename
    1213191