DocumentCode :
1917863
Title :
Adaptive Metric-Aware Job Scheduling for Production Supercomputers
Author :
Tang, Wei ; Ren, Dongxu ; Lan, Zhiling ; Desai, Narayan
Author_Institution :
Dept. of Comput. Sci., Illinois Inst. of Technol., Chicago, IL, USA
fYear :
2012
fDate :
10-13 Sept. 2012
Firstpage :
107
Lastpage :
115
Abstract :
Job scheduling is a critical and complex task on large-scale supercomputers where a scheduling policy is expected to fulfill amorphous and sometimes conflicting goals from both users and system owners. Moreover, the effectiveness of a scheduling policy is dependent on workload characteristics which vary from time to time. Thus it is challenging to design a versatile scheduling policy that is effective in all circumstances. To address this issue, we propose an adaptive metric-aware job scheduling strategy. First, we propose metric-aware scheduling which enables the scheduler to balance competing scheduling goals represented by different metrics such as job waiting time, fairness, and system utilization. Second, we enhance the scheduler to adaptively adjust scheduling policies based on feedback information of monitored metrics at runtime. We evaluate our design using real workloads from supercomputer centers and demonstrate that our scheduling mechanism can significantly improve system performance in a balanced, sustainable fashion.
Keywords :
parallel machines; processor scheduling; adaptive metric-aware job scheduling; feedback information; job waiting time; large-scale supercomputer; production supercomputer; scheduling policy; workload characteristic; Measurement; Monitoring; Processor scheduling; Resource management; Schedules; Scheduling; Tuning; adaptive policy tuning; job scheduling; metric-aware; resource management; supercomputer;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Processing Workshops (ICPPW), 2012 41st International Conference on
Conference_Location :
Pittsburgh, PA
ISSN :
1530-2016
Print_ISBN :
978-1-4673-2509-7
Type :
conf
DOI :
10.1109/ICPPW.2012.17
Filename :
6337469