DocumentCode :
652247
Title :
A Comparative Study of Job Scheduling Strategies in Large-Scale Parallel Computational Systems
Author :
Chandio, Aftab Ahmed ; Cheng-Zhong Xu ; Tziritas, Nikos ; Bilal, Kashif ; Khan, Samee U.
Author_Institution :
Shenzhen Inst. of Adv. Technol., Shenzhen, China
fYear :
2013
fDate :
16-18 July 2013
Firstpage :
949
Lastpage :
957
Abstract :
With the advent of High Performance Computing (HPC) in large-scale parallel computational environments, job scheduling and resource allocation techniques are required to deliver Quality of Service (QoS) and resource management. Job scheduling on large-scale parallel systems has therefore been studied to: (a) minimize queue time and response time, and (b) maximize overall system utilization. We compare thirteen job scheduling policies and analyze their behavior. The set of job scheduling policies includes: (a) priority-based policies, (b) first fit, (c) backfilling techniques, and (d) window-based policies. All of the policies are extensively simulated and compared. A real data center workload comprising 22385 jobs is used for simulation. We analyze (a) queue time, (b) response time, and (c) slowdown ratio to evaluate the policies. Moreover, we present a comprehensive workload characterization that can be used as a tool for optimizing a system's performance and for scheduler design. We investigate four categories of workload characteristics, (a) Narrow, (b) Wide, (c) Short, and (d) Long, for detailed analysis of the schedulers' performance. This study highlights the strengths and weaknesses of various job scheduling policies and helps in choosing an appropriate job scheduling policy for a given scenario.
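The abstract names three evaluation metrics without defining them. The sketch below assumes the definitions commonly used in the job scheduling literature (queue time = start minus submit, response time = end minus submit, slowdown = response time divided by run time); the paper itself may define them differently. The `Job` record and its field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; all times are in seconds."""
    submit: float  # time the job entered the queue
    start: float   # time the job began executing
    end: float     # time the job finished


def queue_time(job: Job) -> float:
    # Time spent waiting before execution began (assumed definition).
    return job.start - job.submit


def response_time(job: Job) -> float:
    # Waiting time plus run time (assumed definition).
    return job.end - job.submit


def slowdown(job: Job) -> float:
    # Response time normalized by run time (assumed definition).
    run_time = job.end - job.start
    if run_time <= 0:
        raise ValueError("run time must be positive")
    return response_time(job) / run_time


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Two made-up jobs, purely for illustration (not data from the paper).
jobs = [Job(submit=0, start=10, end=30), Job(submit=5, start=30, end=40)]
avg_queue = mean(queue_time(j) for j in jobs)
avg_response = mean(response_time(j) for j in jobs)
avg_slowdown = mean(slowdown(j) for j in jobs)
```

Under these assumed definitions, a short job that waits as long as a long job has a much higher slowdown, which is why slowdown is often reported alongside raw queue time.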
Keywords :
computer centres; parallel processing; performance evaluation; quality of service; resource allocation; scheduling; HPC; QoS; Quality of Service; comparative study; data center; high performance computing; job scheduling; job scheduling strategies; large scale parallel computational systems; parallel computational environment; queue time; resource allocation techniques; resource management; Dynamic scheduling; Optimal scheduling; Processor scheduling; Quality of service; Resource management; System performance; Data center; Job Scheduling; Large-scale Parallel Computational Systems; Workload Characterization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Trust, Security and Privacy in Computing and Communications (TrustCom), 2013 12th IEEE International Conference on
Conference_Location :
Melbourne, VIC
Type :
conf
DOI :
10.1109/TrustCom.2013.116
Filename :
6680936