Regional Information Center for Science and Technology (RICeST) - A Fast Duplicate Chunk Identifying Method Based on Hierarchical Indexing Structure

DocumentCode :

1610954

Title :

A Fast Duplicate Chunk Identifying Method Based on Hierarchical Indexing Structure

Author :

Can Wang ; Zhi-guang Qin ; Lei Yang ; Juan Wang

Author_Institution :

Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China

Year :

2012

Firstpage :

624

Lastpage :

627

Abstract :

To solve the disk bottleneck problem of deduplication systems without depending on data locality, a fast duplicate chunk identifying method based on a hierarchical indexing structure is proposed. In this method, the traditional flat indexing structure is divided vertically into two layers, and only a handful of the most representative indices, selected according to Broder's theorem, are kept in RAM. Experimental results on real data that lacks locality indicate that the deduplication performance of this method reaches 87.05% of the optimal value with a far smaller RAM requirement than current methods.
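
The abstract does not give the selection rule or the index layout, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that the "representative indices" of a segment are its k smallest chunk fingerprints, a common min-hash style choice motivated by Broder's theorem (for a random hash, the probability that two sets share the same minimum hash value equals their Jaccard similarity). The names `TwoLayerIndex`, `add_segment`, and the parameter `k` are hypothetical, and a Python dict stands in for the on-disk layer.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Return the SHA-1 hex digest used here as the chunk fingerprint."""
    return hashlib.sha1(chunk).hexdigest()


class TwoLayerIndex:
    """Illustrative two-layer index (an assumption, not the paper's design).

    Upper layer (RAM): representative fingerprint -> ids of segments it represents.
    Lower layer ("disk"): segment id -> full set of chunk fingerprints.
    """

    def __init__(self, k: int = 2):
        self.k = k            # number of representatives kept per segment
        self.ram = {}         # small in-memory layer
        self.disk = {}        # stand-in for the large on-disk layer
        self.disk_reads = 0   # counts simulated lower-layer loads

    def representatives(self, fps):
        # Assumed rule: the k smallest distinct fingerprints of the segment.
        return sorted(set(fps))[: self.k]

    def add_segment(self, chunks):
        """Store a segment; return one flag per chunk (True = duplicate found)."""
        fps = [fingerprint(c) for c in chunks]
        reps = self.representatives(fps)

        # Step 1: consult only the RAM layer to find similar stored segments.
        candidates = set()
        for rep in reps:
            candidates |= self.ram.get(rep, set())

        # Step 2: load the full fingerprint sets of the candidates only.
        known = set()
        for seg_id in candidates:
            self.disk_reads += 1
            known |= self.disk[seg_id]

        # Step 3: flag chunks already known or repeated within this segment.
        flags, seen = [], set()
        for fp in fps:
            flags.append(fp in known or fp in seen)
            seen.add(fp)

        # Step 4: record the segment in both layers.
        seg_id = len(self.disk)
        self.disk[seg_id] = set(fps)
        for rep in reps:
            self.ram.setdefault(rep, set()).add(seg_id)
        return flags
```

In this sketch a segment that shares no representative with any stored segment triggers no lower-layer load at all, which is how such a scheme would avoid a disk lookup per chunk. The trade-off is that duplicates in segments whose representatives do not match are missed, which is consistent with the abstract reporting a fraction of the optimal deduplication value rather than all of it.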

Keywords :

indexing; random-access storage; Broder theorem; RAM; deduplication system; disk bottleneck problem; duplicate chunk identifying method; flat indexing structure; hierarchical indexing structure; representative indices; Educational institutions; Feature extraction; Indexing; Random access memory; Throughput; Writing; data locality; deduplication; disk bottleneck

Language :

English

Publisher :

IEEE

Conference_Title :

Industrial Control and Electronics Engineering (ICICEE), 2012 International Conference on

Conference_Location :

Xi'an

Print_ISBN :

978-1-4673-1450-3

Type :

conf

DOI :

10.1109/ICICEE.2012.169

Filename :

6322458

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1610954