DocumentCode :
170718
Title :
Distributed data storage systems with opportunistic repair
Author :
Aggarwal, Vaneet ; Chao Tian ; Vaishampayan, Vinay A. ; Chen, Y.-F.R.
Author_Institution :
AT&T Labs.-Res., Florham Park, NJ, USA
fYear :
2014
fDate :
April 27 2014-May 2 2014
Firstpage :
1833
Lastpage :
1841
Abstract :
The reliability of erasure-coded distributed storage systems, as measured by the mean time to data loss (MTTDL), depends on the repair bandwidth of the code. Repair-efficient codes provide reliability values several orders of magnitude better than conventional erasure codes. Current state-of-the-art codes fix the number of helper nodes (nodes participating in repair) a priori. In practice, however, it is desirable to allow the number of helper nodes to be adaptively determined by the network traffic conditions. In this work, we propose an opportunistic repair framework to address this issue. It is shown that there exists a threshold on the storage overhead, below which such an opportunistic approach does not lose any efficiency from the optimal storage-repair-bandwidth tradeoff; i.e., it is possible to construct a code simultaneously optimal for different numbers of helper nodes. We further examine the benefits of such opportunistic codes, and derive the MTTDL improvement for two repair models: one with limited total repair bandwidth and the other with limited individual-node repair bandwidth. In both settings, we show orders-of-magnitude improvement in MTTDL. Finally, the proposed framework is examined in a network setting where a significant improvement in MTTDL is observed.
Keywords :
storage area networks; storage management; telecommunication traffic; MTTDL; erasure-coded distributed data storage system reliability; helper nodes; limited individual-node repair bandwidth; limited total repair bandwidth; mean time-to-data loss; network traffic conditions; opportunistic codes; opportunistic repair framework; optimal storage-repair-bandwidth tradeoff; reliability values; repair models; repair-efficient codes; storage overhead; Bandwidth; Computers; Conferences; Loss measurement; Maintenance engineering; Peer-to-peer computing; Reliability;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
INFOCOM, 2014 Proceedings IEEE
Conference_Location :
Toronto, ON
Type :
conf
DOI :
10.1109/INFOCOM.2014.6848122
Filename :
6848122