• DocumentCode
    1883459
  • Title

    A Study on Job Co-Allocation in Multiple HPC Clusters

  • Author

    Qin, Jinhui ; Bauer, Michael

  • Author_Institution
    University of Western Ontario, Canada
  • fYear
    2006
  • fDate
    14-17 May 2006
  • Firstpage
    3
  • Lastpage
    3
  • Abstract
    To more effectively use HPC clusters for even larger computations, improve turn-around times and better utilize compute resources, users are looking to interconnect multiple HPC clusters, creating a grid. To effectively use such grids, it may be desirable to split and co-allocate jobs requiring many processes across multiple clusters. While splitting a very large job across multiple clusters is an attractive possibility, the benefit, in terms of improving turn-around time, ultimately depends on the communication patterns between processes, workload on the communication links, and the maximum bandwidth of the links. The objective of this work is to understand the impact of communications on multi-processor jobs in order to develop scheduling strategies and job allocation algorithms for multi-cluster grids which can accommodate communication factors. In this paper we report on initial investigations of some co-allocation strategies. This evaluation is based on a simulator that has been implemented and validated experimentally across two HPC clusters.
  • Keywords
    Bandwidth; Clustering algorithms; Computer networks; Computer science; Costs; Grid computing; High performance computing; Processor scheduling; Resource management; Scheduling algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Performance Computing in an Advanced Collaborative Environment, 2006. HPCS 2006. 20th International Symposium on
  • ISSN
    1550-5243
  • Print_ISBN
    0-7695-2582-2
  • Type

    conf

  • DOI
    10.1109/HPCS.2006.8
  • Filename
    1628194