Optimal remapping in dynamic bulk synchronous computations via a stochastic control approach

Author

Yin, G. George ; Xu, Cheng-Zhong ; Wang, Le Yi

Author_Institution

Dept. of Math., Wayne State Univ., Detroit, MI, USA

Volume

14

Issue

1

fYear

2003

fDate

1/1/2003

Firstpage

51

Lastpage

62

Abstract

A bulk synchronous computation proceeds in phases separated by barrier synchronization. For dynamic bulk synchronous computations, whose computational requirements vary from phase to phase, remapping at runtime is an effective way to maintain parallel efficiency. The paper introduces a novel remapping strategy for computations whose workload changes can be modeled as a Markov chain. The Markovian model accommodates statistical dependence and more complex structure than the usual assumption of independent, identically distributed random variables. The models are quite general: no conditions are imposed on the dynamics of the underlying process beyond its transition probability matrix. It is shown that optimal remapping can be formulated as a binary decision process: remap or not at a given synchronizing instant. The optimal strategy for long-lasting computations is then developed by employing optimal stopping rules in a stochastic control framework. The existence of optimal controls is established, and necessary and sufficient conditions for optimality are obtained. Furthermore, a policy iteration algorithm is devised to reduce computational complexity and speed up convergence to the desired optimal control.
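The binary remap-or-continue decision and the policy iteration idea mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a discounted-cost formulation rather than the optimal stopping formulation, and the names `policy_iteration`, `cost`, `remap_cost`, and `beta`, as well as the three-state chain, are hypothetical choices made only for illustration. State 0 is taken to be the balanced workload, and remapping is assumed to pay a fixed overhead and return the computation to state 0.

```python
def q_values(i, V, P, cost, remap_cost, beta):
    """Return (Q_continue, Q_remap) for state i under value estimate V."""
    n = len(P)
    # Continue: pay the imbalance cost of state i, then move according to row i.
    q_continue = cost[i] + beta * sum(P[i][j] * V[j] for j in range(n))
    # Remap (assumed model): pay a fixed overhead, then behave like balanced state 0.
    q_remap = remap_cost + cost[0] + beta * sum(P[0][j] * V[j] for j in range(n))
    return q_continue, q_remap


def evaluate(policy, P, cost, remap_cost, beta, tol=1e-12):
    """Iterative policy evaluation: repeat the fixed-policy update until it converges."""
    V = [0.0] * len(P)
    while True:
        new_V = [q_values(i, V, P, cost, remap_cost, beta)[policy[i]]
                 for i in range(len(P))]
        if max(abs(a - b) for a, b in zip(new_V, V)) < tol:
            return new_V
        V = new_V


def policy_iteration(P, cost, remap_cost, beta=0.9):
    """Alternate evaluation and improvement until the policy stops changing.

    policy[i] is 0 (continue) or 1 (remap) at a synchronization point in state i.
    """
    policy = [0] * len(P)  # start from "never remap"
    while True:
        V = evaluate(policy, P, cost, remap_cost, beta)
        improved = []
        for i in range(len(P)):
            q = q_values(i, V, P, cost, remap_cost, beta)
            current, other = policy[i], 1 - policy[i]
            # Switch only on a strict improvement, which rules out cycling on ties.
            improved.append(other if q[other] < q[current] - 1e-9 else current)
        if improved == policy:
            return policy, V
        policy = improved


# Hypothetical chain: imbalance drifts from state 0 toward state 2 and stays there.
P = [[0.7, 0.3, 0.0],
     [0.0, 0.6, 0.4],
     [0.0, 0.0, 1.0]]
cost = [0.0, 1.0, 4.0]   # per-phase imbalance penalty in each state
policy, V = policy_iteration(P, cost, remap_cost=2.0, beta=0.9)
print(policy, [round(v, 4) for v in V])
```

For these illustrative numbers the resulting policy continues in the balanced state and remaps in both imbalanced states. Evaluation is done by repeated substitution to keep the sketch dependency-free; solving the linear system directly would be the usual alternative.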

Keywords

Markov processes; computational complexity; parallel processing; processor scheduling; stochastic processes; synchronisation; Markov chain; Markovian model; barrier synchronization; binary decision process; bulk synchronous computation; computational complexity; dynamic bulk synchronous computations; independent identically distributed random variable assumptions; optimal controls; optimal remapping; optimal stopping rules; optimal strategy; parallel efficiency; policy iteration algorithm; remapping strategy; statistical dependence; stochastic control approach; synchronizing instant; transition probability matrix; varying phase-wise computational requirements; workload changes; Computational fluid dynamics; Computational modeling; Concurrent computing; Dynamic scheduling; Optimal control; Peer to peer computing; Probability; Random variables; Runtime; Stochastic processes;

fLanguage

English

Journal_Title

Parallel and Distributed Systems, IEEE Transactions on

Publisher

IEEE

ISSN

1045-9219

Type

jour

DOI

10.1109/TPDS.2003.1167370

Filename

1167370