Title :
POD: Performance Oriented I/O Deduplication for Primary Storage Systems in the Cloud
Author :
Bo Mao ; Hong Jiang ; Suzhen Wu ; Lei Tian
Author_Institution :
Software Sch., Xiamen Univ., Xiamen, China
Abstract :
Recent studies have shown that moderate to high data redundancy clearly exists in primary storage systems in the Cloud. Our experimental studies reveal that data redundancy is much more intense on the I/O path than on disks, owing to the relatively high temporal access locality associated with small I/O requests to redundant data. On the other hand, we also observe that directly applying data deduplication to primary storage systems in the Cloud will likely cause space contention in memory and data fragmentation on disks. Based on these observations, we propose a Performance-Oriented I/O Deduplication approach, called POD, rather than a capacity-oriented I/O deduplication approach, represented by iDedup, to improve the I/O performance of primary storage systems in the Cloud without sacrificing the capacity savings that the capacity-oriented approach provides. The salient feature of POD is its focus on not only the capacity-sensitive large writes and files, as in iDedup, but also the small writes and files that are performance-sensitive yet capacity-insensitive. Experiments conducted on our lightweight prototype implementation of POD show that POD significantly outperforms iDedup in I/O performance by up to 87.9%, with an average of 58.8%. Moreover, our evaluation results show that POD achieves capacity savings comparable to or better than those of iDedup.
Keywords :
cloud computing; data handling; redundancy; storage management; I/O path intensity; POD; capacity-oriented I/O deduplication approach; data deduplication; data fragmentation; high data redundancy; high temporal access locality; iDedup; performance-oriented I/O deduplication approach; primary storage systems; space contention; Cache storage; Educational institutions; Indexes; Memory management; Monitoring; Prototypes; Redundancy; Data Redundancy; I/O Deduplication; I/O Performance; Primary Storage; Storage Capacity
Conference_Title :
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-3799-8
DOI :
10.1109/IPDPS.2014.84