• DocumentCode
    140827
  • Title

    Pagrol: Parallel graph OLAP over large-scale attributed graphs

  • Author

    Zhengkui Wang ; Qi Fan ; Huiju Wang ; Kian-Lee Tan ; Deepak Agrawal ; Amr El Abbadi

  • Author_Institution
    NUS Grad. Sch. of Integrative Sci. & Eng., Nat. Univ. of Singapore, Singapore, Singapore
  • fYear
    2014
  • fDate
    March 31 2014-April 4 2014
  • Firstpage
    496
  • Lastpage
    507
  • Abstract
    Attributed graphs are becoming important tools for modeling information networks, such as the Web and various social networks (e.g., Facebook, LinkedIn, Twitter). However, it is computationally challenging to manage and analyze attributed graphs to support effective decision making. In this paper, we propose Pagrol, a parallel graph OLAP (Online Analytical Processing) system over attributed graphs. In particular, Pagrol introduces a new conceptual Hyper Graph Cube model (an attributed-graph analogue of the data cube model for relational DBMS) to aggregate attributed graphs at different granularities and levels. The proposed model supports different queries as well as a new set of graph OLAP Roll-Up/Drill-Down operations. Furthermore, on the basis of Hyper Graph Cube, Pagrol provides an efficient MapReduce-based parallel graph cubing algorithm, MRGraph-Cubing, to compute the graph cube for an attributed graph. Pagrol employs several optimization techniques: (a) a self-contained join strategy to minimize I/O cost; (b) a scheme that groups cuboids into batches so as to minimize redundant computations; (c) a cost-based scheme to allocate the batches into bags (each with a small number of batches); and (d) an efficient scheme to process a bag using a single MapReduce job. Results of extensive experimental studies using both real Facebook and synthetic datasets on a 128-node cluster show that Pagrol is effective, efficient and scalable.
  • Keywords
    data mining; graph theory; parallel algorithms; social networking (online); Facebook; MRGraph-cubing; MapReduce-based parallel graph cubing algorithm; Pagrol; conceptual hyper graph cube model; decision making; information networks; large-scale attributed graphs; numerous optimization techniques; online analytical processing; parallel graph OLAP system; self-contained join strategy; single MapReduce job; Aggregates; Cities and towns; Computational modeling; Decision making; Educational institutions; Lattices; Warehousing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2014 IEEE 30th International Conference on
  • Conference_Location
    Chicago, IL
  • Type

    conf

  • DOI
    10.1109/ICDE.2014.6816676
  • Filename
    6816676