Title :
On Minimizing Data-Read and Download for Storage-Node Recovery
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of California, Berkeley, Berkeley, CA, USA
Abstract :
We consider the problem of efficient recovery of the data stored in any individual node of a distributed storage system, from the rest of the nodes. Applications include handling failures and degraded reads. We measure efficiency in terms of the amount of data-read and the download required. To minimize the download, we focus on the minimum bandwidth setting of the 'regenerating codes' model for distributed storage. Under this model, the system has a total of n nodes, and the data stored in any node must be (efficiently) recoverable from any d of the other (n-1) nodes. Lower bounds on the two metrics under this model were derived previously; it has also been shown that these bounds are achievable for the amount of data-read and download when d = n-1, and for the amount of download alone when d ≠ n-1. In this paper, we complete the picture by proving the converse result, that when d ≠ n-1, these lower bounds are strictly loose with respect to the amount of read required. The proof is information-theoretic, and hence applies to non-linear codes as well. We also show that under two (practical) relaxations of the problem setting, these lower bounds can be met for both read and download simultaneously.
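As a rough illustration of the minimum bandwidth setting referred to above, the following sketch computes the per-helper download, the per-node storage, and the total repair download at the minimum-bandwidth point of the standard regenerating-codes model. It assumes the usual extra parameter k (the number of nodes sufficient to reconstruct the whole file), which the abstract does not mention; the function name `mbr_parameters` and its arguments are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def mbr_parameters(file_size, k, d):
    """Minimum-bandwidth point of the standard regenerating-codes model.

    Assumptions (not stated in the abstract): any k nodes suffice to
    reconstruct a file of `file_size` symbols, and a repair contacts d
    helper nodes (k <= d), each sending beta symbols.
    """
    if not (1 <= k <= d):
        raise ValueError("require 1 <= k <= d")
    # Per-helper download: beta = 2B / (k * (2d - k + 1)).
    beta = Fraction(2 * file_size, k * (2 * d - k + 1))
    # At the minimum-bandwidth point each node stores exactly d * beta,
    # so the total repair download equals the amount stored in the node.
    alpha = d * beta
    return {"beta": beta, "alpha": alpha, "download": d * beta}


# Example: B = 9 symbols, k = 3, d = 4.
params = mbr_parameters(9, 3, 4)
print(params["beta"], params["alpha"], params["download"])
```

In this example each helper sends one symbol and the repaired node stores four, so the download matches the stored amount. The abstract's result concerns how much the helpers must read to produce those symbols when d ≠ n-1, which this sketch does not model.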
Keywords :
cache storage; distributed shared memory systems; nonlinear codes; system recovery; data-read and download; distributed storage system; regenerating codes model; storage-node recovery; Bandwidth; Data models; Distributed databases; Maintenance engineering; Symmetric matrices; Systematics; Vectors; Distributed storage; efficient-repair; regenerating codes; reliability
Journal_Title :
Communications Letters, IEEE
DOI :
10.1109/LCOMM.2013.040213.130006