DocumentCode :
3571262
Title :
A Cluster-Based Data De-duplication Technology
Author :
Chuan-Mu Tseng ; Jheng-Rong Ciou ; Tzong-Jye Liu
Author_Institution :
Dept. of Appl. Digital Media, Jeh-teh Junior Coll. of Med. Nursing & Manage., Miaoli, Taiwan
fYear :
2014
Firstpage :
226
Lastpage :
230
Abstract :
Data deduplication technology usually identifies redundant data quickly and correctly by using a Bloom filter. A Bloom filter can report whether a chunk may already be stored, but it can also return false positives. To rule out a false positive, a new chunk must be compared with the chunks that have already been stored. To reduce the time spent on this check, current research uses many small index tables to store chunk IDs. However, a target chunk ID is stored in only one index table, so searching the other index tables wastes a great deal of time. In this paper, we cluster the stored chunks to reduce the time needed to exclude the false positives induced by the Bloom filter.
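The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the filter parameters, and in particular the clustering rule (first byte of the chunk ID modulo the number of clusters) are assumptions made only for illustration. The sketch shows a Bloom filter used as a fast pre-check, followed by a confirmation lookup in the single index table that belongs to the chunk's cluster.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, but false positives are possible."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class ClusteredDedupStore:
    """Hypothetical store: one small index table of chunk IDs per cluster."""

    def __init__(self, num_clusters=4):
        self.bloom = BloomFilter()
        self.index_tables = [set() for _ in range(num_clusters)]
        self.lookups = 0  # number of index tables consulted so far

    def _cluster_of(self, chunk_id):
        # Assumed clustering rule, not taken from the paper.
        return chunk_id[0] % len(self.index_tables)

    def store(self, chunk):
        """Return True if the chunk was new and stored, False if it was a duplicate."""
        chunk_id = hashlib.sha256(chunk).digest()
        table = self.index_tables[self._cluster_of(chunk_id)]
        if self.bloom.might_contain(chunk_id):
            # Possible duplicate: confirm in the one table that could hold it.
            self.lookups += 1
            if chunk_id in table:
                return False
        self.bloom.add(chunk_id)
        table.add(chunk_id)
        return True
```

Because each chunk ID maps to exactly one cluster, a positive answer from the filter costs a single index-table lookup instead of a scan over every table, which is the saving the abstract describes.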
Keywords :
data structures; pattern clustering; bloom filter technology; chunk ID; cluster-based data deduplication technology; index table; redundant data identification; Arrays; Cloud computing; Indexes; Kernel; Linux; Multimedia communication; System performance; Bloom filter; cluster; data deduplication;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computing and Networking (CANDAR), 2014 Second International Symposium on
Type :
conf
DOI :
10.1109/CANDAR.2014.22
Filename :
7052186