Title :
A Bayesian runtime load manager on a shared cluster
Author :
Santos, Luis Paulo ; Proença, A.
Author_Institution :
Dept. de Inf., Minho Univ., Braga, Portugal
Abstract :
The efficient execution of irregular data-parallel applications on dynamically shared computing clusters requires novel approaches to managing the runtime load distribution. Such environments behave unpredictably and dynamically, owing both to the application's requirements and to the system's available resources. This uncertainty was the main motivation for proposing and evaluating an application-level scheduler whose decisions are taken efficiently, using more accurate predictions of the environment's current and near-future state that are derived from the available incomplete and aged measurement data. Bayesian decision networks are used as the scheduler's decision-making mechanism; its effectiveness in managing the load distribution of a parallel ray tracer is assessed and compared with alternative strategies. The evaluation results, obtained with complex scenes on a shared cluster of 7 nodes with dynamically varying workloads, show considerable performance improvements over blind strategies and stress the benefits over a sensor-based deterministic approach of identical complexity.
Keywords :
belief networks; computational complexity; ray tracing; resource allocation; workstation clusters; Bayesian decision networks; Bayesian runtime load manager; application level scheduler; complexity; dynamic behaviour; dynamically shared computing clusters; parallel ray tracer; runtime load distribution; sensor based deterministic approach; shared cluster; Aging; Bayesian methods; Concurrent computing; Current measurement; Decision making; Distributed computing; Layout; Load management; Processor scheduling; Runtime;
Conference_Title :
Cluster Computing and the Grid, 2001. Proceedings. First IEEE/ACM International Symposium on
Conference_Location :
Brisbane, Qld.
Print_ISBN :
0-7695-1010-8
DOI :
10.1109/CCGRID.2001.923259