  • DocumentCode
    3144139
  • Title
    Distributed cube materialization on holistic measures
  • Author
    Nandi, Arnab ; Yu, Cong ; Bohannon, Philip ; Ramakrishnan, Raghu
  • Author_Institution
    Dept. of EECS, Univ. of Michigan, Ann Arbor, MI, USA
  • fYear
    2011
  • fDate
    11-16 April 2011
  • Firstpage
    183
  • Lastpage
    194
  • Abstract
    Cube computation over massive datasets is critical for many important analyses done in the real world. Unlike commonly studied algebraic measures such as SUM, which are amenable to parallel computation, efficient cube computation of holistic measures such as TOP-K is non-trivial and often impossible with current methods. In this paper, we detail real-world challenges in cube materialization tasks on Web-scale datasets. Specifically, we identify an important subset of holistic measures and introduce MR-Cube, a MapReduce-based framework for efficient cube computation on these measures. We provide extensive experimental analyses over both real and synthetic data. We demonstrate that, unlike existing techniques, which cannot scale to the 100-million-tuple mark for our datasets, MR-Cube successfully and efficiently computes cubes with holistic measures over billion-tuple datasets.
  • Keywords
    Internet; data analysis; MR-Cube; MapReduce based framework; TOP-K; Web-scale datasets; cube computation; distributed cube materialization; holistic measures; Algorithm design and analysis; Cities and towns; Current measurement; Distributed databases; Lattices; Marketing and sales; USA Councils
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2011 IEEE 27th International Conference on
  • Conference_Location
    Hannover
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4244-8959-6
  • Electronic_ISBN
    1063-6382
  • Type
    conf
  • DOI
    10.1109/ICDE.2011.5767884
  • Filename
    5767884