• DocumentCode
    154062
  • Title

    CDES: An Approach to HPC Workload Modelling

  • Author

    Brennan, John ; Holmes, Violeta ; Kureshi, Ibad

  • Author_Institution
    HPC Res. Group, Univ. of Huddersfield, Huddersfield, UK
  • fYear
    2014
  • fDate
    1-3 Oct. 2014
  • Firstpage
    47
  • Lastpage
    54
  • Abstract
    Computational science and complex system administration rely on being able to model user interactions. When managing HPC, HTC and grid systems, user workloads (that is, users' job submission behaviour) are an important metric when designing systems or scheduling algorithms. Most simulators are either inflexible or tied to proprietary scheduling systems. For system administrators, being able to model how a scheduling algorithm behaves, or how modifying system configurations can affect job completion rates, is critical. Within computer science research, many algorithms are presented with no real description or verification of their behaviour. In this paper we present the Cluster Discrete Event Simulator (CDES) as a strong candidate for HPC workload simulation. Built around an open framework, CDES can take system definitions and multi-platform real usage logs, and can be interfaced with any scheduling algorithm through an API. CDES has been tested against 3 years of usage logs from a production-level HPC system and verified to greater than 95% accuracy.
  • Keywords
    discrete event simulation; parallel processing; scheduling; CDES; CDES approach; HPC system; HPC workload modelling; HPC workload simulation; HTC system; cluster discrete event simulator; grid system; high performance computing; job submission behaviour; scheduling algorithm; system configuration; Arrays; Clustering algorithms; Computational modeling; Educational institutions; Scheduling algorithms; Testing; Torque; HPC; HPC simulator; WMS; scheduler; workload modelling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Simulation and Real Time Applications (DS-RT), 2014 IEEE/ACM 18th International Symposium on
  • Conference_Location
    Toulouse
  • ISSN
    1550-6525
  • Print_ISBN
    978-1-4799-6143-6
  • Type

    conf

  • DOI
    10.1109/DS-RT.2014.15
  • Filename
    6957176