Title :
DedupT: Deduplication for tape systems
Author :
Gharaibeh, Ammar ; Constantinescu, C. ; Lu, Maohua ; Routray, Ramani ; Sharma, Ashok ; Sarkar, Pradyut ; Pease, D. ; Ripeanu, Matei
Author_Institution :
Univ. of British Columbia, Vancouver, BC, Canada
Abstract :
Deduplication is a commonly used technique on disk-based storage pools. However, deduplication has not been used for tape-based pools: tape characteristics, such as high mount and seek times, combined with the data fragmentation resulting from deduplication, create a toxic combination that leads to unacceptably high retrieval times. This work proposes DedupT, a system that efficiently supports deduplication on tape pools. This paper (i) details the main challenges in enabling efficient deduplication on tape libraries; (ii) presents a class of solutions, based on graph modeling of the similarity between data items, that enables efficient placement on tapes; and (iii) presents the design and evaluation of novel cross-tape and on-tape chunk placement algorithms that alleviate tape mount time overhead and reduce on-tape data fragmentation. Using 4.5 TB of real-world workloads, we show that DedupT retains at least 95% of the deduplication efficiency. We also show that DedupT mitigates the major retrieval time overheads and, because it reads less data, offers better restore performance than restoring non-deduplicated data.
Keywords :
data handling; graph theory; magnetic tape storage; storage management; DedupT; cross-tape chunk placement algorithm; data item similarity; deduplication efficiency; disk-based storage pools; graph-modeling; on-tape chunk placement algorithm; on-tape data fragmentation reduction; retrieval time overhead; seek time; tape characteristics; tape libraries; tape mount time overhead; tape pool deduplication; tape systems; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data models; Databases; Libraries; Servers
Conference_Title :
Mass Storage Systems and Technologies (MSST), 2014 30th Symposium on
Conference_Location :
Santa Clara, CA
DOI :
10.1109/MSST.2014.6855555