Title :
Joint Dimensioning of Server and Network Infrastructure for Resilient Optical Grids/Clouds
Author :
Develder, Chris ; Buysse, Jens ; Dhoedt, Bart ; Jaumard, Brigitte
Author_Institution :
Dept. of Inf. Technol., Ghent Univ., Ghent, Belgium
Abstract :
We address the dimensioning of infrastructure, comprising both network and server resources, for large-scale decentralized distributed systems such as grids or clouds. We design the resulting grid/cloud to be resilient against network link or server failures. To this end, we exploit relocation: Under failure conditions, a grid job or cloud virtual machine may be served at an alternate destination (i.e., different from the one under failure-free conditions). We thus consider grid/cloud requests to have a known origin, but assume a degree of freedom as to where they end up being served, which is the case for grid applications of the bag-of-tasks (BoT) type or hosted virtual machines in the cloud case. We present a generic methodology based on integer linear programming (ILP) that (i) chooses a given number of sites in a given network topology at which to install server infrastructure, and (ii) determines the amount of both network and server capacity needed to cater for both the failure-free scenario and failures of links or nodes. For the latter, we consider either failure-independent (FID) or failure-dependent (FD) recovery. Case studies on European-scale networks show that relocation allows a considerable reduction of the total amount of network and server resources, especially in sparse topologies and for higher numbers of server sites. Adopting a failure-dependent backup routing strategy does lead to lower resource dimensions, but only when we adopt relocation (especially for a high number of server sites): Without exploiting relocation, potential savings of FD versus FID are not meaningful.
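The relocation idea in the abstract can be illustrated with a small sketch. It is not the ILP methodology of the paper: it only uses a hop-count breadth-first search on a hypothetical six-node ring with server sites assumed at nodes 0 and 3, and compares the backup route of one request with and without relocation after a single link failure. All function names, the topology, and the site placement are illustrative assumptions.

```python
from collections import deque


def nearest_path(adj, src, targets, failed=None):
    """Breadth-first search over unit-length links.

    Returns the node list of a shortest path from src to the closest node in
    targets, never crossing the failed link (a frozenset of two nodes).
    Returns None if no target is reachable.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u in targets:
            # Walk the parent pointers back to src, then reverse.
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in sorted(adj[u]):
            if v not in parent and frozenset((u, v)) != failed:
                parent[v] = u
                queue.append(v)
    return None


def backup_path(adj, origin, sites, failed, relocation):
    """Route a request after a link failure (toy model, assumption).

    Without relocation the request must still reach the server site it used
    under failure-free conditions; with relocation any server site will do.
    """
    primary = nearest_path(adj, origin, set(sites))
    if primary is None:
        return None
    targets = set(sites) if relocation else {primary[-1]}
    return nearest_path(adj, origin, targets, failed)


# Hypothetical 6-node ring 0-1-2-3-4-5-0 with server sites at nodes 0 and 3.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
sites = (0, 3)
cut = frozenset((0, 1))  # the failed link

no_reloc = backup_path(ring, 1, sites, cut, relocation=False)
with_reloc = backup_path(ring, 1, sites, cut, relocation=True)
print("without relocation:", no_reloc)
print("with relocation:   ", with_reloc)
```

In this toy case the request from node 1 is served at node 0 under failure-free conditions. When link 0-1 fails, keeping the same destination forces a five-hop detour around the ring, whereas relocating to the server site at node 3 needs only two hops. The sketch says nothing about shared capacity or server dimensioning, which the ILP in the paper handles jointly.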
Keywords :
cloud computing; grid computing; integer programming; linear programming; network servers; optical fibre networks; optical links; telecommunication computing; telecommunication network reliability; telecommunication network routing; telecommunication network topology; virtual machines; BoT type; European-scale network; FD recovery; FID recovery; ILP; bag-of-task type; cloud virtual machine; degree of freedom; failure-dependent backup routing strategy; failure-dependent recovery; failure-free scenario; failure-independent recovery; integer linear programming; large-scale decentralized distributed system; link failure; network resource infrastructure; resilient optical grid-cloud; server resource infrastructure; sparse network topology; Bandwidth; Data models; Network topology; Optical fiber networks; Routing; Servers; WDM networks; Anycast; column generation; dimensioning; integer linear programming (ILP); optical networks
Journal_Title :
Networking, IEEE/ACM Transactions on
DOI :
10.1109/TNET.2013.2283924