DocumentCode :
167517
Title :
Integration and Evaluation of Decentralized Fairshare Prioritization (Aequus)
Author :
Espling, Daniel ; Östberg, Per-Olov ; Elmroth, Erik
Author_Institution :
Dept. of Comput. Sci., Umeå Univ., Umeå, Sweden
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
1198
Lastpage :
1207
Abstract :
Fairshare is commonly one of the factors used by cluster resource management systems to prioritize jobs during scheduling. Despite the grid vision of a transparent and unified infrastructure, fairshare is normally calculated and enforced at the local cluster level rather than at a grid-wide scale. Aequus is a self-contained decentralized system for grid-wide fairshare job prioritization. Using Aequus, detailed global share policies can be combined with local cluster policies to offer a unified grid fairshare prioritization system where local administrations retain control over their clusters. This work shows how Aequus can be integrated with local resource management systems such as SLURM and Maui with minimal intrusion. Early results from production help assess the maturity of the system, and the system is further tested and evaluated for use at a nation-wide scale using workload modeling techniques. Statistical models are created based on historical national grid usage data, and synthetic traces based on these models are used to create a diverse input set used to exemplify system behavior. The system is shown to behave consistently despite great variations in job arrival patterns and partial participation of some of the collaborating installations.
Keywords :
grid computing; resource allocation; scheduling; statistical analysis; Aequus system; Maui system; SLURM system; cluster policy; cluster resource management systems; decentralized fairshare prioritization; grid-wide fairshare job prioritization; scheduling; statistical models; Data models; Load modeling; Processor scheduling; Resource management; Scheduling; Vectors; Fairshare scheduling; Grid scheduling; Workload modeling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
Type :
conf
DOI :
10.1109/IPDPSW.2014.135
Filename :
6969517