DocumentCode :
2446911
Title :
Combining virtualization, resource characterization, and resource management to enable efficient high performance compute platforms through intelligent dynamic resource allocation
Author :
Brandt, J. ; Chen, F. ; De Sapio, V. ; Gentile, A. ; Mayo, J. ; Pébay, P. ; Roe, D. ; Thompson, D. ; Wong, M.
Author_Institution :
Sandia Nat. Labs., Livermore, CA, USA
fYear :
2010
fDate :
19-23 April 2010
Firstpage :
1
Lastpage :
8
Abstract :
Improved resource utilization and fault tolerance of large-scale HPC systems can be achieved through fine-grained, intelligent, and dynamic resource (re)allocation. We explore components and enabling technologies applicable to creating a system that provides this capability, specifically: 1) scalable fine-grained monitoring and analysis to inform resource allocation decisions, 2) virtualization to enable dynamic reconfiguration, 3) resource management for the combined physical and virtual resources, and 4) orchestration of the allocation, evaluation, and balancing of resources in a dynamic environment. We discuss both general and HPC-centric issues that impact the design of such a system. Finally, we present our prototype system, giving both design details and examples of its application in real-world scenarios.
Keywords :
resource allocation; software fault tolerance; system monitoring; systems analysis; fault tolerance; intelligent dynamic resource allocation; large-scale HPC systems; resource characterization; resource management; scalable fine-grained analysis; scalable fine-grained monitoring; virtualization; Condition monitoring; Environmental management; Failure analysis; High performance computing; Information analysis; Laboratories; Platform virtualization; Resource management; Resource virtualization; Technology management; HPC; IaaS; KVM; migration; resource management; virtualization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE International Symposium on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4244-6533-0
Type :
conf
DOI :
10.1109/IPDPSW.2010.5470719
Filename :
5470719