Title :
Multi-version coding in distributed storage
Author :
Wang, Zhiying ; Cadambe, Viveck
Author_Institution :
Center for Sci. of Inf., Stanford Univ., Stanford, CA, USA
fDate :
June 29 2014-July 4 2014
Abstract :
We investigate an information-theoretic problem motivated by storing multiple versions of a data object in distributed storage systems. Specifically, in a storage system with n server nodes, where there are ν independent message versions, each server receives message values corresponding to some arbitrary subset of the versions. The versions are assumed to be totally ordered. Each server is unaware of the set of versions at the other servers, and aims to encode the values corresponding to the versions it has. We investigate codes where, from any set of c nodes (c < n), the value corresponding to the highest common version, as per the version ordering, available at this set of c nodes is decodable. We aim to design codes that minimize the storage cost. We present two main results in this paper. First, we show that the storage cost is lower bounded by 1 - (1 - 1/c)^ν, measured in terms of the bits of the values. Second, for the cases of ν = 2 and ν = 3, we provide new code constructions that respectively achieve storage costs of (2c-1)/c^2 and (3c-2)/c^2, measured in terms of the bits of the values. Our code constructions are simple in that we do not code across versions. We argue that when the number of versions ν is much larger than c, replication is close to optimal.
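The formulas in the abstract can be checked numerically. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and it assumes costs are normalized so that storing one full value (replication of a single version) costs 1. It evaluates the lower bound 1 - (1 - 1/c)^ν and the stated achievable costs for ν = 2 and ν = 3 with exact fractions.

```python
from fractions import Fraction


def lower_bound(c: int, v: int) -> Fraction:
    """Lower bound on storage cost from the abstract: 1 - (1 - 1/c)^v."""
    return 1 - (1 - Fraction(1, c)) ** v


def achieved_cost(c: int, v: int) -> Fraction:
    """Storage cost of the constructions stated in the abstract (v = 2 or 3 only)."""
    if v == 2:
        return Fraction(2 * c - 1, c * c)
    if v == 3:
        return Fraction(3 * c - 2, c * c)
    raise ValueError("the abstract gives constructions only for v = 2 and v = 3")


# Assumption for illustration: replication stores one full value per server.
REPLICATION_COST = Fraction(1)

if __name__ == "__main__":
    c = 4
    for v in (2, 3):
        print(v, lower_bound(c, v), achieved_cost(c, v))
    # With many more versions than c, the bound approaches the replication cost.
    print(float(REPLICATION_COST - lower_bound(c, 50)))
```

Under these assumptions, the ν = 2 construction meets the bound exactly, since 1 - (1 - 1/c)^2 = (2c-1)/c^2, while the ν = 3 cost sits slightly above it. As ν grows with c fixed, (1 - 1/c)^ν shrinks toward 0, so the bound tends to 1, which is consistent with the claim that replication is close to optimal when ν is much larger than c.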
Keywords :
distributed processing; encoding; storage management; code constructions; data object; distributed storage systems; independent message versions; information theoretic problem; multi-version coding; storage cost minimization; Distributed databases; Encoding; Entropy; Random variables; Servers;
Conference_Titel :
Information Theory (ISIT), 2014 IEEE International Symposium on
Conference_Location :
Honolulu, HI
DOI :
10.1109/ISIT.2014.6874957