• DocumentCode
    1783241
  • Title

    A Coprocessor Sharing-Aware Scheduler for Xeon Phi-Based Compute Clusters

  • Author

    Coviello, Giuseppe ; Cadambi, Srihari ; Chakradhar, Srimat

  • Author_Institution
    NEC Labs. America, Inc., Princeton, NJ, USA
  • fYear
    2014
  • fDate
    19-23 May 2014
  • Firstpage
    337
  • Lastpage
    346
  • Abstract
    We propose a cluster scheduling technique for compute clusters with Xeon Phi coprocessors. Even though the Xeon Phi runs Linux, which allows multiprocessing, cluster schedulers generally do not allow jobs to share coprocessors because sharing can cause oversubscription of coprocessor memory and thread resources. It has been shown that memory or thread oversubscription on a many-core processor like the Phi results in job crashes or drastic performance loss. We first show that such an exclusive device allocation policy causes severe coprocessor underutilization: for typical workloads, on average only 38% of the Xeon Phi cores are busy across the cluster. Then, to improve coprocessor utilization, we propose a scheduling technique that enables safe coprocessor sharing without resource oversubscription. Jobs specify their maximum memory and thread requirements, and our scheduler packs as many jobs as possible on each coprocessor in the cluster, subject to resource limits. We solve this problem using a greedy approach at the cluster level combined with a knapsack-based algorithm for each node. Every coprocessor is modeled as a knapsack, and jobs are packed into each knapsack with the goal of maximizing job concurrency, i.e., having as many jobs as possible executing on each coprocessor. Given a set of jobs, we show that this strategy of packing for high concurrency is a good proxy for (i) reducing makespan, without the need for users to specify job execution times, and (ii) reducing coprocessor footprint, i.e., the number of coprocessors required to finish the jobs without increasing makespan. We implement the entire system as a seamless add-on to Condor, a popular distributed job scheduler, and show makespan and footprint reductions of more than 50% across a wide range of workloads.
  • Keywords
    coprocessors; greedy algorithms; multiprocessing systems; pattern clustering; processor scheduling; Condor; Linux; Xeon Phi-based compute clusters; cluster scheduling technique; coprocessor footprint reduction; coprocessor memory oversubscription; coprocessor sharing-aware scheduler; coprocessor underutilization; coprocessor utilization; distributed job scheduler; exclusive device allocation policy; greedy approach; job concurrency maximization; knapsack-based algorithm; multiprocessing; performance loss; thread oversubscription; thread resources; Concurrent computing; Coprocessors; Hardware; Instruction sets; Linux; Memory management; Servers; Middleware; coprocessors; high performance computing; processor scheduling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4799-3799-8
  • Type

    conf

  • DOI
    10.1109/IPDPS.2014.44
  • Filename
    6877268
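
The packing idea summarized in the abstract (each coprocessor treated as a knapsack, with a greedy pass over the cluster) can be illustrated with a minimal Python sketch. This is not the authors' implementation or the Condor add-on: the names Job, pack_one, and schedule, the job list, and the 8000 MB / 240-thread capacities are all assumptions made for illustration. The sketch only assumes what the abstract states, namely that each job declares a maximum memory and thread requirement and that the goal on each device is to run as many jobs as possible without exceeding either limit.

```python
from typing import NamedTuple


class Job(NamedTuple):
    # Hypothetical job descriptor; field names are illustrative.
    name: str
    mem_mb: int    # declared maximum memory
    threads: int   # declared maximum thread count


def pack_one(jobs, mem_cap, thread_cap):
    """Pick a largest subset of jobs that fits one coprocessor.

    Exact 0/1 knapsack with two constraints (memory, threads) and unit
    value per job, so the objective is the number of concurrent jobs.
    """
    # best[(mem_used, threads_used)] = largest job tuple with that usage
    best = {(0, 0): ()}
    for job in jobs:
        # Snapshot the states so each job is added at most once.
        for (mem, thr), chosen in list(best.items()):
            new_mem, new_thr = mem + job.mem_mb, thr + job.threads
            if new_mem > mem_cap or new_thr > thread_cap:
                continue  # would oversubscribe the device
            candidate = chosen + (job,)
            if len(candidate) > len(best.get((new_mem, new_thr), ())):
                best[(new_mem, new_thr)] = candidate
    return list(max(best.values(), key=len))


def schedule(jobs, coprocessors):
    """Greedy cluster pass: fill each coprocessor in turn.

    coprocessors maps a device name to (mem_cap, thread_cap).
    Returns (placement by device, names of jobs left waiting).
    """
    pending = list(jobs)
    placement = {}
    for device, (mem_cap, thread_cap) in coprocessors.items():
        chosen = pack_one(pending, mem_cap, thread_cap)
        placement[device] = [job.name for job in chosen]
        for job in chosen:
            pending.remove(job)
    return placement, [job.name for job in pending]


# Illustrative workload and device capacities (assumed values).
example_jobs = [
    Job("a", 4000, 120),
    Job("b", 3000, 60),
    Job("c", 2000, 60),
    Job("d", 6000, 200),
    Job("e", 1000, 60),
]
example_cluster = {"phi0": (8000, 240), "phi1": (8000, 240)}
placement, waiting = schedule(example_jobs, example_cluster)
print(placement, waiting)
```

Because every job counts equally, the sketch needs no execution-time estimates, which mirrors the abstract's point that packing for concurrency avoids asking users for job run times. Ties between equally large subsets are broken arbitrarily here; a real scheduler would need a deliberate tie-breaking policy.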