Title :
DEBAR: A scalable high-performance de-duplication storage system for backup and archiving
Author :
Yang, Tianming ; Jiang, Hong ; Feng, Dan ; Niu, Zhongying ; Zhou, Ke ; Wan, Yaping
Author_Institution :
Wuhan Nat. Lab. for Optoelectron., Huazhong Univ. of Sci. & Technol., Wuhan, China
Abstract :
Driven by the growing demand for large-scale, high-performance data protection, disk-based de-duplication storage has become a research focus for both the storage industry and the research community, and several new schemes have emerged recently. So far, these systems mainly take inline de-duplication approaches, which are centralized and are not easily extended to handle global de-duplication in a distributed environment. We present DEBAR, a de-duplication storage system designed to improve capacity, performance and scalability for de-duplication backup/archiving. DEBAR performs post-processing de-duplication: in phase I, backup streams are de-duplicated through an in-memory preliminary filter and cached on server disks; in phase II, they are completely de-duplicated in batch. By decentralizing fingerprint lookup and update, DEBAR supports a cluster of servers performing de-duplication backup in parallel, and is shown to scale linearly in both write throughput and physical capacity, achieving an aggregate throughput of 1.7 GB/s and supporting a physical capacity of 2 PB with 16 backup servers.
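The two-phase flow described in the abstract can be illustrated with a minimal single-process sketch. This is not DEBAR's implementation: the class `TwoPhaseDedup`, its method names, the SHA-1 fingerprint, the bounded set used as the preliminary filter, and the in-memory list and dict standing in for the server-disk cache and the fingerprint index are all assumptions made for illustration.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint of a chunk (SHA-1 is an assumption here)."""
    return hashlib.sha1(chunk).hexdigest()


class TwoPhaseDedup:
    """Hypothetical sketch of post-processing de-duplication in two phases."""

    def __init__(self, filter_capacity: int = 1024):
        self.filter_capacity = filter_capacity
        self.prelim_filter = set()  # phase I: bounded in-memory filter
        self.disk_cache = []        # stand-in for the server-disk cache
        self.index = {}             # stand-in for the persistent fingerprint index

    def phase1_ingest(self, chunks):
        """Drop chunks the in-memory filter has already seen; cache the rest."""
        for chunk in chunks:
            fp = fingerprint(chunk)
            if fp in self.prelim_filter:
                continue  # duplicate caught by the preliminary filter
            if len(self.prelim_filter) < self.filter_capacity:
                self.prelim_filter.add(fp)
            # Once the filter is full, some duplicates slip through to phase II.
            self.disk_cache.append((fp, chunk))

    def phase2_batch(self) -> int:
        """Resolve all cached fingerprints against the index in one batch."""
        stored = 0
        # Sorting by fingerprint lets a real index be scanned sequentially.
        for fp, chunk in sorted(self.disk_cache, key=lambda entry: entry[0]):
            if fp not in self.index:
                self.index[fp] = chunk
                stored += 1
        self.disk_cache.clear()
        self.prelim_filter.clear()
        return stored
```

In this sketch the filter is deliberately allowed to miss duplicates once it is full; the batch pass in phase II is what guarantees that each unique chunk is stored only once.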
Keywords :
information retrieval systems; records management; storage allocation; DEBAR system; archiving; backup; distributed environment; high-performance data protection; postprocessing de-duplication; scalable high-performance de-duplication storage system; storage industry; Aggregates; Bandwidth; Computer industry; Computer science; Data engineering; Fingerprint recognition; Large-scale systems; Protection; Scalability; Throughput; backup and archive; data de-duplication; post-processing
Conference_Title :
Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4244-6442-5
DOI :
10.1109/IPDPS.2010.5470468