DocumentCode :
3089877
Title :
Reliability of Clustered vs. Declustered Replica Placement in Data Storage Systems
Author :
Venkatesan, Vinodh ; Iliadis, Ilias ; Fragouli, Christina ; Urbanke, Rüdiger
Author_Institution :
IBM Res. - Zurich, Zurich, Switzerland
fYear :
2011
fDate :
25-27 July 2011
Firstpage :
307
Lastpage :
317
Abstract :
The placement of replicas across storage nodes in a replication-based storage system is known to affect rebuild times and therefore system reliability. Earlier work has shown that, for a replication factor of two, the reliability is essentially unaffected by the replica placement scheme because all placement schemes have mean times to data loss (MTTDLs) within a factor of two for practical values of the failure rate, storage capacity, and rebuild bandwidth of a storage node. However, for higher replication factors, simulation results reveal that this no longer holds. Moreover, an analytical derivation of MTTDL becomes intractable for general placement schemes. In this paper, we develop a theoretical model that is applicable for any replication factor and provides a good approximation of the MTTDL for small failure rates. This model characterizes the system behavior by using an analytically tractable measure of reliability: the probability of the shortest path to data loss following the first node failure. It is shown that, for highly reliable systems, this measure approximates well the probability of all paths to data loss after the first node failure and prior to the completion of rebuild, and leads to a rough estimation of the MTTDL. The results obtained are of theoretical and practical importance and are confirmed by means of simulations. As our results show, the declustered placement scheme, contrary to intuition, offers a reliability for replication factors greater than two that does not decrease as the number of nodes in the system increases.
Keywords :
pattern clustering; probability; storage management; data loss; data storage systems; declustered replica placement; first node failure; reliability; replication-based storage system; shortest path probability; Analytical models; Approximation methods; Bandwidth; Loss measurement; Parallel processing; Reliability theory; clustered; declustered; replica placement; storage system
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2011 IEEE 19th International Symposium on
Conference_Location :
Singapore
ISSN :
1526-7539
Print_ISBN :
978-1-4577-0468-0
Type :
conf
DOI :
10.1109/MASCOTS.2011.53
Filename :
6005375