• DocumentCode
    653511
  • Title

    Data Deduplication Cluster Based on Similarity-Locality Approach

  • Author

    Xingyu Zhang ; Jian Zhang

  • Author_Institution
    Sch. of Comput. Eng. & Sci., Shanghai Univ., Shanghai, China
  • fYear
    2013
  • fDate
    20-23 Aug. 2013
  • Firstpage
    2168
  • Lastpage
    2172
  • Abstract
    We have entered the big data era, and rapidly growing data volumes pose major challenges for data storage. Existing deduplication methods do not work adequately in many situations. Deduplication clusters have recently become an important requirement for most commercial and research backup systems and are now widely used in storage systems for data backup and archiving. Many researchers focus on deduplication clusters as a way to eliminate more redundant data, and block-level deduplication clusters in particular have become popular. Such clusters face two challenges: the chunk-lookup disk bottleneck problem and the data routing problem. This paper proposes a new solution to the chunk-lookup disk bottleneck: an approach that combines similarity with locality is applied to the deduplication cluster. In addition, a Bloom filter storing fingerprints is used to find more duplicate data between nodes. The system architecture and its details are presented. Finally, experiments show that the system performs well.
  • Keywords
    Big Data; data compression; data structures; storage management; big data era; block level deduplication cluster; bloom filter algorithm; chunk-lookup disk bottleneck problem; data archiving; data backup systems; data deduplication cluster; data routing problem; data storage; similarity-locality approach; Filtering algorithms; Fingerprint recognition; Information filters; Information management; Servers; Throughput; bloom filter; cluster; deduplication; locality; similarity;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2013 IEEE International Conference on Green Computing and Communications (GreenCom) and IEEE Internet of Things (iThings) and IEEE Cyber, Physical and Social Computing (CPSCom)
  • Conference_Location
    Beijing
  • Type

    conf

  • DOI
    10.1109/GreenCom-iThings-CPSCom.2013.409
  • Filename
    6682419