• DocumentCode
    2945259
  • Title

    Mixing Deduplication and Compression on Active Data Sets

  • Author

    Constantinescu, Cornel ; Glider, Joseph ; Chambliss, David

  • Author_Institution
    IBM Almaden Res. Center, San Jose, CA, USA
  • fYear
    2011
  • fDate
    29-31 March 2011
  • Firstpage
    393
  • Lastpage
    402
  • Abstract
    Many new storage systems provide some form of data reduction. We examine data reduction methods that might be suitable for primary storage systems serving active data (as contrasted with backup and archive systems), by analysis of file sets found in different active data environments. We address the following questions: how effective are compression and variations of deduplication, both separately and in combination; when deduplication and compression are combined, which should be applied first; what is the tradeoff between the different methods in their use of MIPS relative to the data reduction achieved; and what degree of data reduction should be expected for different data types.
  • Keywords
    data analysis; data compression; information retrieval; MIPS; active data environment; active data sets compression; data reduction method; deduplication mixing; primary storage system; Algorithm design and analysis; Biomedical imaging; Databases; Image coding; Portable computers; Servers; Virtual machining; data reduction; deduplication; storage systems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference (DCC), 2011
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    978-1-61284-279-0
  • Type

    conf

  • DOI
    10.1109/DCC.2011.46
  • Filename
    5749497