Title :
DS-Dedupe: A scalable, low network overhead data routing algorithm for inline cluster deduplication system
Author :
Zhen Sun ; Nong Xiao ; Fang Liu ; Yinjin Fu
Author_Institution :
State Key Lab. of High Performance Comput., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
Inline cluster deduplication has been widely used in data centers to improve storage efficiency. The data routing algorithm has a crucial impact on the deduplication factor, throughput, and scalability of a cluster deduplication system. In this paper, we propose a stateful data routing algorithm called DS-Dedupe. To make full use of the similarity in data streams, DS-Dedupe builds a super-chunk-granularity similarity index in each client to track the super-chunks that have already been routed. A similarity coefficient computed from this index then determines whether a new super-chunk is assigned directly or by consistent hashing, thus striking a sensible tradeoff between deduplication factor and network overhead. Our experiments on two datasets demonstrate that DS-Dedupe achieves a high elimination ratio at a low communication overhead. In addition, because data routing is performed by the client nodes, the metadata server bottleneck can be avoided.
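The routing decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the similarity coefficient, so the sketch assumes it is the fraction of a super-chunk's representative fingerprints (its k smallest chunk fingerprints) that the client-side index already maps to a single node. The names ClientRouter, ConsistentHashRing, threshold, k and vnodes are all hypothetical.

```python
import bisect
import hashlib
from collections import Counter


def _ring_pos(data: bytes) -> int:
    """Map bytes to a 64-bit integer position on the hash ring."""
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


class ConsistentHashRing:
    """Plain consistent hashing with virtual nodes (assumed design)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (_ring_pos(f"{node}#{i}".encode()), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [pos for pos, _ in self._ring]

    def lookup(self, key: bytes):
        # First virtual node clockwise from the key, wrapping around.
        idx = bisect.bisect_right(self._keys, _ring_pos(key)) % len(self._keys)
        return self._ring[idx][1]


class ClientRouter:
    """Stateful client-side router; threshold and k are assumed parameters."""

    def __init__(self, nodes, threshold=0.5, k=8):
        self.ring = ConsistentHashRing(nodes)
        self.threshold = threshold
        self.k = k
        self.index = {}  # representative fingerprint -> node it was routed to

    def route(self, fingerprints):
        if not fingerprints:
            raise ValueError("a super-chunk needs at least one fingerprint")
        # Assumption: the k smallest fingerprints represent the super-chunk.
        reps = sorted(set(fingerprints))[: self.k]
        hits = Counter(self.index[fp] for fp in reps if fp in self.index)
        node = None
        if hits:
            best, count = hits.most_common(1)[0]
            # Assumed similarity coefficient: share of known representatives.
            if count / len(reps) >= self.threshold:
                node = best  # similar enough: assign directly
        if node is None:
            node = self.ring.lookup(reps[0])  # otherwise: consistent hash
        for fp in reps:
            self.index.setdefault(fp, node)
        return node
```

In this sketch a super-chunk is passed as a list of chunk fingerprints (bytes), for example SHA-1 digests. The decision uses only client-local state, which mirrors the abstract's point that routing does not depend on a metadata server.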
Keywords :
computer centres; network servers; telecommunication network routing; telecommunication transmission lines; DS-Dedupe; client node; consistent hash; data centers; data streams; deduplication factor; elimination ratio; inline cluster deduplication; low communication overhead; low network overhead data routing; metadata server bottleneck; scalable data routing; sensible tradeoff; stateful data routing; super-chunk granularity similarity index; Algorithm design and analysis; Clustering algorithms; Fingerprint recognition; Indexes; Routing; Servers; Throughput; data deduplication; data routing algorithm; network overhead; scalability;
Conference_Title :
Computing, Networking and Communications (ICNC), 2014 International Conference on
Conference_Location :
Honolulu, HI
DOI :
10.1109/ICCNC.2014.6785456