Title :
Robust resource allocations in parallel computing systems: model and heuristics
Author :
Shestak, Vladimir ; Siegel, Howard Jay ; Maciejewski, Anthony A. ; Ali, Shoukat
Author_Institution :
Dept. of Electr. & Comput. Eng., Colorado State Univ., Fort Collins, CO, USA
Abstract :
The resources in parallel computer systems (including heterogeneous clusters) should be allocated to the computational applications in a way that maximizes some system performance measure. However, allocation decisions and associated performance prediction are often based on estimated values of application and system parameters. The actual values of these parameters may differ from the estimates; for example, the estimates may represent only average values, the models used to generate the estimates may have limited accuracy, and there may be changes in the environment. Thus, an important research problem is the development of resource management strategies that can guarantee a particular system performance given such uncertainties. To address this problem, we have designed a model for deriving the degree of robustness of a resource allocation: the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (QoS) can be guaranteed. The model is presented and we demonstrate its ability to select the most robust resource allocation from among those that otherwise perform similarly (based on the primary performance criterion). The model's use in allocation heuristics is also demonstrated. This model is applicable to different types of computing and communication environments, including parallel, distributed, cluster, grid, Internet, embedded, and wireless.
Keywords :
parallel processing; quality of service; resource allocation; QoS; parallel computing systems; resource management strategy; robust resource allocation; system performance; Application software; Computer applications; Concurrent computing; Distributed computing; Embedded computing; Parallel processing; Resource management; Robustness; System performance; Uncertainty
Conference_Title :
Parallel Architectures, Algorithms and Networks, 2005. ISPAN 2005. Proceedings. 8th International Symposium on
Print_ISBN :
0-7695-2509-1
DOI :
10.1109/ISPAN.2005.75