• DocumentCode
    3036656
  • Title
    Compression-Aware Algorithms for Massive Datasets
  • Author
    Brunelle, Nathan; Robins, Gabriel; Shelat, Abhi
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Virginia, Charlottesville, VA, USA
  • fYear
    2015
  • fDate
    7-9 April 2015
  • Firstpage
    441
  • Lastpage
    441
  • Abstract
    While massive datasets are often stored in compressed format, most algorithms are designed to operate on uncompressed data. We address this growing disconnect by developing a framework for compression-aware algorithms that operate directly on compressed datasets. Synergistically, we also propose new algorithmically-aware compression schemes that enable algorithms to efficiently process the compressed data. In particular, we apply this general methodology to geometric / CAD datasets that are ubiquitous in areas such as graphics, VLSI, and geographic information systems. We develop example algorithms and corresponding compression schemes that address different types of datasets, including point sets and graphs. Our methods are more efficient than their classical counterparts, and they extend to both lossless and lossy compression scenarios. This motivates further investigation of how this approach can enable algorithms to process ever-increasing big data volumes.
  • Keywords
    Big Data; data compression; Big Data volumes; CAD dataset; algorithmically-aware compression scheme; compression-aware algorithm; computer-aided dataset; geometric dataset; lossless compression; lossy compression; massive dataset; Algorithm design and analysis; Big data; Computer science; Data compression; Design automation; Graphics; Very large scale integration; algorithmically-aware compressions; compression-aware algorithms; geometric algorithms; graph algorithms; graph compression; pointset compression
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference (DCC), 2015
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Type
    conf
  • DOI
    10.1109/DCC.2015.74
  • Filename
    7149304