DocumentCode
2446911
Title
Combining virtualization, resource characterization, and resource management to enable efficient high performance compute platforms through intelligent dynamic resource allocation
Author
Brandt, J. ; Chen, F. ; De Sapio, V. ; Gentile, A. ; Mayo, J. ; Pébay, P. ; Roe, D. ; Thompson, D. ; Wong, M.
Author_Institution
Sandia Nat. Labs., Livermore, CA, USA
fYear
2010
fDate
19-23 April 2010
Firstpage
1
Lastpage
8
Abstract
Improved resource utilization and fault tolerance of large-scale HPC systems can be achieved through fine-grained, intelligent, and dynamic resource (re)allocation. We explore components and enabling technologies for building a system that provides this capability, specifically: 1) scalable fine-grained monitoring and analysis to inform resource allocation decisions, 2) virtualization to enable dynamic reconfiguration, 3) resource management for the combined physical and virtual resources, and 4) orchestration of the allocation, evaluation, and balancing of resources in a dynamic environment. We discuss both general and HPC-centric issues that affect the design of such a system. Finally, we present our prototype system, giving both design details and examples of its application in real-world scenarios.
Keywords
resource allocation; software fault tolerance; system monitoring; systems analysis; fault tolerance; intelligent dynamic resource allocation; large-scale HPC systems; resource characterization; resource management; scalable fine-grained analysis; scalable fine-grained monitoring; virtualization; Condition monitoring; Environmental management; Failure analysis; High performance computing; Information analysis; Laboratories; Platform virtualization; Resource management; Resource virtualization; Technology management; HPC; IaaS; KVM; migration; resource management; virtualization
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE International Symposium on
Conference_Location
Atlanta, GA
Print_ISBN
978-1-4244-6533-0
Type
conf
DOI
10.1109/IPDPSW.2010.5470719
Filename
5470719