• DocumentCode
    124625
  • Title
    DS-Dedupe: A scalable, low network overhead data routing algorithm for inline cluster deduplication system
  • Author
    Zhen Sun ; Nong Xiao ; Fang Liu ; Yinjin Fu
  • Author_Institution
    State Key Lab. of High Performance Comput., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2014
  • fDate
    3-6 Feb. 2014
  • Firstpage
    895
  • Lastpage
    899
  • Abstract
    Inline cluster deduplication is widely used in data centers to improve storage efficiency. The data routing algorithm has a crucial impact on the deduplication factor, throughput, and scalability of a cluster deduplication system. In this paper, we propose a stateful data routing algorithm called DS-Dedupe. To make full use of the similarity in data streams, DS-Dedupe builds a similarity index at super-chunk granularity in each client to track the super-chunks that have already been routed. It then calculates a similarity coefficient from the index to determine whether a new super-chunk should be assigned directly or by consistent hashing, thus striking a sensible tradeoff between deduplication factor and network overhead. Our experiments on two datasets demonstrate that DS-Dedupe achieves a high elimination ratio at a low communication overhead. In addition, because data routing is performed by the client node, the metadata server bottleneck can be avoided.
  • Keywords
    computer centres; network servers; telecommunication network routing; telecommunication transmission lines; DS-Dedupe; client node; consistent hash; data centers; data streams; deduplication factor; elimination ratio; inline cluster deduplication; low communication overhead; low network overhead data routing; metadata server bottleneck; scalable data routing; sensible tradeoff; stateful data routing; super-chunk granularity similarity index; Algorithm design and analysis; Clustering algorithms; Fingerprint recognition; Indexes; Routing; Servers; Throughput; data deduplication; data routing algorithm; network overhead; scalability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing, Networking and Communications (ICNC), 2014 International Conference on
  • Conference_Location
    Honolulu, HI
  • Type
    conf
  • DOI
    10.1109/ICCNC.2014.6785456
  • Filename
    6785456
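
The routing decision summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the similarity coefficient, the index layout, or the threshold, so everything below is an assumption made for illustration. In particular, the "handprint" of the k smallest chunk fingerprints, the coefficient as the fraction of handprint entries already seen for one node, the 0.5 threshold, and the names `ClientRouter`, `ConsistentHashRing`, and `handprint` are all hypothetical.

```python
import bisect
import hashlib
from collections import Counter


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint of one chunk (SHA-1 is an assumed choice)."""
    return hashlib.sha1(chunk).hexdigest()


def handprint(super_chunk, size):
    """Assumed representative set: the `size` smallest distinct fingerprints."""
    return sorted({fingerprint(c) for c in super_chunk})[:size]


class ConsistentHashRing:
    """Plain consistent hashing with virtual nodes, used as the fallback."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (int(hashlib.md5(f"{node}#{i}".encode()).hexdigest(), 16), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [key for key, _ in self._ring]

    def lookup(self, key: str) -> str:
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        # First ring position clockwise from h, wrapping around at the end.
        idx = bisect.bisect_right(self._keys, h) % len(self._keys)
        return self._ring[idx][1]


class ClientRouter:
    """Client-side stateful router: similarity index first, hash ring second."""

    def __init__(self, nodes, threshold=0.5, handprint_size=8):
        self.ring = ConsistentHashRing(nodes)
        self.threshold = threshold            # assumed cut-off, not from the paper
        self.handprint_size = handprint_size  # assumed handprint size
        self.index = {}                       # representative fingerprint -> node

    def route(self, super_chunk):
        if not super_chunk:
            raise ValueError("super-chunk must contain at least one chunk")
        hp = handprint(super_chunk, self.handprint_size)
        # Count, per node, how many handprint entries were routed there before.
        hits = Counter(self.index[fp] for fp in hp if fp in self.index)
        node = None
        if hits:
            best, count = hits.most_common(1)[0]
            # Assumed similarity coefficient: share of the handprint already
            # stored on the best-matching node.
            if count / len(hp) >= self.threshold:
                node = best           # similar enough: assign directly
        if node is None:
            node = self.ring.lookup(hp[0])  # otherwise: consistent hash
        for fp in hp:
            self.index[fp] = node     # remember where this super-chunk went
        return node


router = ClientRouter(["node-0", "node-1", "node-2", "node-3"])
first = router.route([b"a", b"b", b"c", b"d"])
second = router.route([b"a", b"b", b"c", b"x"])  # 3 of 4 fingerprints overlap
```

In this sketch the second super-chunk follows the first to the same node because its coefficient (0.75) exceeds the assumed threshold. Both decisions are made from client-local state and a locally computed hash, so the example involves no query to the storage nodes or to a metadata server.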