DocumentCode
1496346
Title
Predictable High-Performance Computing Using Feedback Control and Admission Control
Author
Park, Sang-Min ; Humphrey, Marty A.
Author_Institution
Dept. of Comput. Sci., Univ. of Virginia, Charlottesville, VA, USA
Volume
22
Issue
3
fYear
2011
fDate
3/1/2011 12:00:00 AM
Firstpage
396
Lastpage
411
Abstract
Historically, batch scheduling has dominated the management of High-Performance Computing (HPC) resources. One of the most significant limitations of this approach is its inability to predict both the start time and the end time of jobs. Although existing research, such as resource reservation and queue-time prediction, partially addresses this issue, a more predictable HPC system is needed, particularly for an emerging class of adaptive real-time HPC applications. This paper presents the design and implementation of a predictable HPC system using feedback control and admission control. By creating a virtualized application layer and opportunistically multiplexing concurrent applications through the application of formal control theory, we regulate a job's progress such that the job meets its deadline without requiring exclusive access to resources, even in the presence of a wide class of unexpected events. Admission control regulates access to resources when they are oversubscribed. Our experimental results using five widely used applications show that the feedback and admission controllers achieve a highly predictable HPC system. The designed feedback controller regulates an HPC job's progress accurately, close to the prediction by theory, thereby showing the successful application of classic control theory to HPC workloads. In week-long experiments, over 90 percent of jobs met their deadlines, and the jobs that missed deadlines still finished close to the requested deadlines (12.4 percent error).
Keywords
distributed processing; feedback; scheduling; virtual machines; admission control; batch scheduling; feedback control; high-performance computing; opportunistic multiplexing concurrent application; predictable HPC system; queue-time prediction; resource reservation; virtualized application layer; Multiprocessor systems; control theory; parallel systems
fLanguage
English
Journal_Title
Parallel and Distributed Systems, IEEE Transactions on
Publisher
IEEE
ISSN
1045-9219
Type
jour
DOI
10.1109/TPDS.2010.100
Filename
5467065