• DocumentCode
    3089877
  • Title
    Reliability of Clustered vs. Declustered Replica Placement in Data Storage Systems
  • Author
    Venkatesan, Vinodh ; Iliadis, Ilias ; Fragouli, Christina ; Urbanke, Rüdiger
  • Author_Institution
    IBM Res. - Zurich, Zurich, Switzerland
  • fYear
    2011
  • fDate
    25-27 July 2011
  • Firstpage
    307
  • Lastpage
    317
  • Abstract
    The placement of replicas across storage nodes in a replication-based storage system is known to affect rebuild times and therefore system reliability. Earlier work has shown that, for a replication factor of two, the reliability is essentially unaffected by the replica placement scheme because all placement schemes have mean times to data loss (MTTDLs) within a factor of two for practical values of the failure rate, storage capacity, and rebuild bandwidth of a storage node. However, for higher replication factors, simulation results reveal that this no longer holds. Moreover, an analytical derivation of the MTTDL becomes intractable for general placement schemes. In this paper, we develop a theoretical model that is applicable to any replication factor and provides a good approximation of the MTTDL for small failure rates. This model characterizes the system behavior by using an analytically tractable measure of reliability: the probability of the shortest path to data loss following the first node failure. It is shown that, for highly reliable systems, this measure closely approximates the probability of all paths to data loss after the first node failure and prior to the completion of rebuild, and leads to a rough estimate of the MTTDL. The results obtained are of theoretical and practical importance and are confirmed by means of simulations. As our results show, the declustered placement scheme, contrary to intuition, offers a reliability for replication factors greater than two that does not decrease as the number of nodes in the system increases.
  • Keywords
    pattern clustering; probability; storage management; data loss; data storage systems; declustered replica placement; first node failure; reliability; replication-based storage system; shortest path probability; Analytical models; Approximation methods; Bandwidth; Loss measurement; Parallel processing; Reliability theory; clustered; declustered; reliability; replica placement; storage system;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2011 IEEE 19th International Symposium on
  • Conference_Location
    Singapore
  • ISSN
    1526-7539
  • Print_ISBN
    978-1-4577-0468-0
  • Type
    conf
  • DOI
    10.1109/MASCOTS.2011.53
  • Filename
    6005375