• DocumentCode
    610404
  • Title
    Materialization strategies in the Vertica analytic database: Lessons learned
  • Author
    Shrinivas, L.; Bodagala, S.; Varadarajan, R.; Cary, A.; Bharathan, V.; Bear, C.
  • Author_Institution
    Vertica Syst., HP Co., Cambridge, MA, USA
  • fYear
    2013
  • fDate
    8-12 April 2013
  • Firstpage
    1196
  • Lastpage
    1207
  • Abstract
    Column store databases allow for various tuple reconstruction strategies (also called materialization strategies). Early materialization is easy to implement but generally performs worse than late materialization. Late materialization is more complex to implement, and usually performs much better than early materialization, although there are situations where it is worse. We identify these situations, which essentially revolve around joins where neither input fits in memory (also called spilling joins). Sideways information passing techniques provide a viable solution to get the best of both worlds. We demonstrate how early materialization combined with sideways information passing allows us to get the benefits of late materialization, without the bookkeeping complexity or worse performance for spilling joins. It also provides some other benefits to query processing in Vertica due to positive interaction with compression and sort orders of the data. In this paper, we report our experiences with late and early materialization, highlight their strengths and weaknesses, and present the details of our sideways information passing implementation. We show experimental results of comparing these materialization strategies, which highlight the significant performance improvements provided by our implementation of sideways information passing (up to 72% on some TPC-H queries).
  • Keywords
    data compression; database management systems; query processing; storage management; TPC-H query; Vertica analytic database; bookkeeping complexity; column store database; late materialization; materialization strategy; memory; sideways information passing technique; spilling joins; tuple reconstruction strategy; Complexity theory; Containers; Context; Data models; Engines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2013 IEEE 29th International Conference on
  • Conference_Location
    Brisbane, QLD, Australia
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4673-4909-3
  • Electronic_ISBN
    1063-6382
  • Type
    conf
  • DOI
    10.1109/ICDE.2013.6544909
  • Filename
    6544909