DocumentCode
187030
Title
Fast Repair for Single Failure in Erasure Coding-Based Distributed Storage Systems
Author
Huayu Zhang ; Hui Li ; Bing Zhu ; Jun Chen
Author_Institution
Shenzhen Grad. Sch., Peking Univ., Shenzhen, China
fYear
2014
fDate
6-9 Oct. 2014
Firstpage
146
Lastpage
151
Abstract
To guarantee data reliability, distributed storage systems widely use erasure codes for their desirable storage properties. However, these codes have one drawback: repairing a single failure requires a large amount of data, which consumes substantial network bandwidth and places a heavy computational load on the replacement node. To address the repair bandwidth problem, researchers have derived the tradeoff between storage and repair traffic from network coding and proposed regenerating codes. However, constructions of regenerating codes complicate both the system and the recovery computation. This paper therefore proposes a distributed repair method based on general erasure codes to reduce the burden of both recovery computation and network traffic. We observe that distributing the recovery computation among helpers spreads out the calculation procedure and accelerates repair in practical systems. Furthermore, by combining this technique with the network topology, we introduce a novel repair tree, also derived from network coding, to minimize repair traffic. The performance of the repair tree is preliminarily analyzed and evaluated, and the results suggest that the storage-bandwidth bound of regenerating codes can be broken under this model.
Keywords
distributed databases; fault tolerant computing; network coding; telecommunication network reliability; telecommunication network topology; data reliability; distributed repair method; erasure coding-based distributed storage systems; general erasure codes; network coding; network topology; network traffic; recovery calculation; recovery computation; regenerating codes; repair bandwidth problem; repair traffic; repair tree; replacement node; storage bandwidth; Bandwidth; Computational modeling; Maintenance engineering; Servers; Silicon; Strips; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Reliable Distributed Systems (SRDS), 2014 IEEE 33rd International Symposium on
Conference_Location
Nara
Type
conf
DOI
10.1109/SRDS.2014.21
Filename
6983389