Regional Information Center for Science and Technology - A Cluster-Based Data De-duplication Technology

DocumentCode :

3571262

Title :

A Cluster-Based Data De-duplication Technology

Author :

Chuan-Mu Tseng ; Jheng-Rong Ciou ; Tzong-Jye Liu

Author_Institution :

Dept. of Appl. Digital Media, Jeh-teh Junior Coll. of Med. Nursing & Manage., Miaoli, Taiwan

Year :

2014

Firstpage :

226

Lastpage :

230

Abstract :

Data deduplication technology usually uses a Bloom filter to identify redundant data quickly and correctly. A Bloom filter can determine whether a chunk may already be stored, but it can return false positives. To rule out a false positive, a new chunk must be compared with the chunks that have already been stored. To reduce the time this check takes, current research uses many small index tables to store chunk IDs. However, a target chunk ID is stored in only one index table, so searching for it in the other index tables wastes a great deal of time. In this paper, we cluster the stored chunks to reduce the time needed to exclude the false positives induced by the Bloom filter.
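
The lookup flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how chunks are clustered, so the rule used here (first byte of a SHA-256 chunk ID, modulo the number of clusters) is an assumption, as are the class names `BloomFilter` and `ClusteredChunkStore` and all sizes. The sketch also confirms a Bloom filter hit by comparing chunk IDs rather than full chunk contents. Its only purpose is to show that a positive answer from the filter is verified against a single index table instead of all of them.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter over byte strings (illustrative sizes only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly stored" (false positives are possible);
        # False means "definitely not stored".
        return all(self.bits[pos] for pos in self._positions(item))


class ClusteredChunkStore:
    """Hypothetical store with one small index table per cluster."""

    def __init__(self, num_clusters=16):
        self.bloom = BloomFilter()
        self.num_clusters = num_clusters
        self.index_tables = [set() for _ in range(num_clusters)]

    def _cluster_of(self, chunk_id):
        # Assumed clustering rule: first byte of the chunk ID.
        return chunk_id[0] % self.num_clusters

    def store(self, chunk):
        """Return True if the chunk was new, False if it was a duplicate."""
        chunk_id = hashlib.sha256(chunk).digest()
        table = self.index_tables[self._cluster_of(chunk_id)]
        # A Bloom filter hit is confirmed in exactly one index table.
        if self.bloom.might_contain(chunk_id) and chunk_id in table:
            return False
        self.bloom.add(chunk_id)
        table.add(chunk_id)
        return True
```

Calling `store` twice with the same bytes returns `True` the first time and `False` the second. A false positive from the filter costs one lookup in one table, whichever cluster the chunk ID maps to.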

Keywords :

data structures; pattern clustering; bloom filter technology; chunk ID; cluster-based data deduplication technology; index table; redundant data identification; Arrays; Cloud computing; Indexes; Kernel; Linux; Multimedia communication; System performance; Bloom filter; cluster; data deduplication;

Language :

English

Publisher :

IEEE

Conference_Title :

Computing and Networking (CANDAR), 2014 Second International Symposium on

Type :

conf

DOI :

10.1109/CANDAR.2014.22

Filename :

7052186

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3571262