• DocumentCode
    3435042
  • Title

    P-Dedupe: Exploiting Parallelism in Data Deduplication System

  • Author

    Xia, Wen ; Jiang, Hong ; Feng, Dan ; Tian, Lei ; Fu, Min ; Wang, Zhongtao

  • Author_Institution
    Sch. of Comput., Huazhong Univ. of Sci. & Technol., Wuhan, China
  • fYear
    2012
  • fDate
    28-30 June 2012
  • Firstpage
    338
  • Lastpage
    347
  • Abstract
    Data deduplication, an efficient space reduction method, has gained increasing attention and popularity in data-intensive storage systems. Most existing state-of-the-art deduplication methods remove redundant data at either the file level or the chunk level, which incurs significant and unavoidable time overheads from chunking and fingerprinting. These overheads can degrade write performance to an unacceptable level in a data storage system. In this paper, we propose P-Dedupe, a fast and scalable deduplication system. The main idea behind P-Dedupe is to combine pipelined and parallel computation across the stages of data deduplication, effectively exploiting the idle resources of modern computer systems with multi-core and many-core processor architectures. Our experimental evaluation of the P-Dedupe prototype on real-world datasets shows that P-Dedupe speeds up deduplication write throughput by a factor of 2 to 4 by pipelining deduplication and parallelizing hash calculation, and that it achieves 80% to 250% of the performance of a conventional storage system without data deduplication.
  • Keywords
    multiprocessing systems; parallel architectures; pipeline processing; storage management; P-Dedupe; data deduplication system; data storage system; data-intensive storage system; many-core processor architecture; multicore processor architecture; parallel computation; parallelizing hash calculation; pipelined computation; pipelining deduplication; redundant data removal; space reduction method; Multicore processing; Pipeline processing; Power capacitors; Throughput; Writing; Chunking; Deduplication; Parallelism;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Networking, Architecture and Storage (NAS), 2012 IEEE 7th International Conference on
  • Conference_Location
    Xiamen, Fujian
  • Print_ISBN
    978-1-4673-1889-1
  • Type

    conf

  • DOI
    10.1109/NAS.2012.46
  • Filename
    6310788
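
  • Illustration
    The abstract describes parallelizing hash calculation across chunks. The sketch below is not the P-Dedupe implementation; it is a minimal illustration under stated assumptions: fixed-size 4 KiB chunks (the abstract does not give the chunking method), SHA-1 fingerprints, a thread pool for the hashing stage, and an in-memory dictionary as the chunk index. All names (CHUNK_SIZE, split_chunks, fingerprint, dedupe, restore) are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4096  # assumed fixed chunk size; not taken from the paper


def split_chunks(data, size=CHUNK_SIZE):
    """Chunking stage: cut the input into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(block):
    """Fingerprinting stage: SHA-1 digest of one chunk (assumed hash)."""
    return hashlib.sha1(block).hexdigest()


def dedupe(data, workers=4):
    """Hash chunks in parallel, then store each unique chunk once.

    Returns (store, recipe): store maps fingerprint -> chunk bytes,
    recipe lists fingerprints in the original chunk order.
    """
    chunks = split_chunks(data)
    # Chunks are independent, so their fingerprints can be computed
    # concurrently; pool.map preserves the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        fingerprints = list(pool.map(fingerprint, chunks))

    store = {}
    recipe = []
    # Index lookup and write stage, kept sequential in this sketch.
    for fp, block in zip(fingerprints, chunks):
        if fp not in store:
            store[fp] = block
        recipe.append(fp)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original byte stream from the recipe."""
    return b"".join(store[fp] for fp in recipe)
```

    For example, calling dedupe on an input made of four 4 KiB chunks in which three are identical keeps two unique chunks in the store, while the recipe still lists four entries so that restore can reproduce the input exactly. A real system would also overlap chunking, fingerprinting, and index lookup as pipeline stages, which this sketch does not attempt.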