DocumentCode :
1610954
Title :
A Fast Duplicate Chunk Identifying Method Based on Hierarchical Indexing Structure
Author :
Can Wang ; Zhi-guang Qin ; Lei Yang ; Juan Wang
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
fYear :
2012
Firstpage :
624
Lastpage :
627
Abstract :
To solve the disk bottleneck problem of deduplication systems without depending on data locality, a fast duplicate chunk identifying method based on a hierarchical indexing structure is proposed. In this method, the traditional flat indexing structure is vertically divided into two layers, and only a handful of the most representative indices, selected according to Broder's theorem, are kept in RAM. Experimental results on real data that lack locality indicate that the deduplication performance of this method can reach 87.05% of the optimal value with a far smaller RAM requirement than current methods.
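The abstract does not give the details of the two-layer index, so the following is only a minimal Python sketch of the general idea under stated assumptions. Broder's theorem says that, under a min-wise independent hash, the probability that two sets share the same minimum hash value equals their Jaccard similarity, so the smallest fingerprints of a group of chunks can serve as its representatives. The class name `TwoLayerIndex`, the grouping of chunks into segments, the parameter `num_representatives`, the use of SHA-1, and the in-memory dictionary standing in for the on-disk layer are all hypothetical choices made for illustration, not taken from the paper.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """SHA-1 hex digest used as the chunk fingerprint (an assumed choice)."""
    return hashlib.sha1(chunk).hexdigest()


class TwoLayerIndex:
    """Toy two-layer index: a small upper layer in RAM, a full lower layer 'on disk'."""

    def __init__(self, num_representatives: int = 2):
        self.k = num_representatives
        self.ram_index = {}   # representative fingerprint -> segment id (upper layer)
        self.disk_index = {}  # segment id -> all chunk fingerprints (stands in for disk)
        self.disk_reads = 0   # counts simulated lower-layer loads

    def _representatives(self, fingerprints):
        # Min-wise sampling: the k smallest distinct fingerprints represent the segment.
        return sorted(set(fingerprints))[: self.k]

    def deduplicate(self, segment_id, chunks):
        """Store a segment and return the chunks not identified as duplicates."""
        fps = [fingerprint(c) for c in chunks]
        reps = self._representatives(fps)

        # Upper layer (RAM): find stored segments that share a representative.
        candidates = {self.ram_index[r] for r in reps if r in self.ram_index}

        # Lower layer (disk): load the full fingerprint sets of those segments only.
        known = set()
        for seg in candidates:
            known |= self.disk_index[seg]
            self.disk_reads += 1

        # Keep chunks that are neither in a candidate segment nor repeated in this one.
        new_chunks, seen = [], set()
        for fp, chunk in zip(fps, chunks):
            if fp not in known and fp not in seen:
                new_chunks.append(chunk)
            seen.add(fp)

        # Record the segment in both layers.
        self.disk_index[segment_id] = set(fps)
        for r in reps:
            self.ram_index.setdefault(r, segment_id)
        return new_chunks


index = TwoLayerIndex(num_representatives=2)
segment = [b"alpha", b"beta", b"gamma", b"delta"]
first_pass = index.deduplicate("seg-1", segment)
second_pass = index.deduplicate("seg-2", segment)
```

In this sketch only `k` fingerprints per segment stay in RAM, and a lookup touches the lower layer only for segments that share a representative with the incoming one. Duplicates in similar segments whose representatives happen to differ are missed, which is consistent with the abstract reporting a fraction of the optimal deduplication value rather than all of it.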
Keywords :
indexing; random-access storage; Broder theorem; RAM; deduplication system; disk bottleneck problem; duplicate chunk identifying method; flat indexing structure; hierarchical indexing structure; representative indices; Educational institutions; Feature extraction; Indexing; Random access memory; Throughput; Writing; data locality; deduplication; disk bottleneck; hierarchical indexing structure;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Industrial Control and Electronics Engineering (ICICEE), 2012 International Conference on
Conference_Location :
Xi'an
Print_ISBN :
978-1-4673-1450-3
Type :
conf
DOI :
10.1109/ICICEE.2012.169
Filename :
6322458