• DocumentCode
    2903847
  • Title

    Improved deduplication through parallel Binning

  • Author

    Zhike Zhang ; Bhagwat, D. ; Litwin, W. ; Long, Derek ; Schwarz, S. J. Thomas

  • Author_Institution
    Univ. of California, Santa Cruz, Santa Cruz, CA, USA
  • fYear
    2012
  • fDate
    1-3 Dec. 2012
  • Firstpage
    130
  • Lastpage
    141
  • Abstract
    Many modern storage systems use deduplication in order to compress data by avoiding storing the same data twice. Deduplication needs to use data stored in the past, but accessing information about all data stored can cause a severe bottleneck. Similarity-based deduplication only accesses information on past data that is likely to be similar and thus more likely to yield good deduplication. We present an adaptive deduplication strategy that extends Extreme Binning and investigate theoretically and experimentally the effects of the additional bin accesses.
  • Keywords
    data compression; parallel processing; adaptive deduplication strategy; extreme binning; improved deduplication; parallel binning; Companies; Data structures; Feature extraction; Indexes; Probability; Random access memory; Search engines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Performance Computing and Communications Conference (IPCCC), 2012 IEEE 31st International
  • Conference_Location
    Austin, TX
  • ISSN
    1097-2641
  • Print_ISBN
    978-1-4673-4881-2
  • Type

    conf

  • DOI
    10.1109/PCCC.2012.6407746
  • Filename
    6407746