• DocumentCode
    710123
  • Title

    Scalable SPARQL querying using path partitioning

  • Author

    Buwen Wu ; Yongluan Zhou ; Pingpeng Yuan ; Ling Liu ; Hai Jin

  • Author_Institution
    SCTS/CGCL, Huazhong Univ. of Sci. & Technol., Wuhan, China
  • fYear
    2015
  • fDate
    13-17 April 2015
  • Firstpage
    795
  • Lastpage
    806
  • Abstract
    The emerging need for conducting complex analysis over big RDF datasets calls for scale-out solutions that can harness a computing cluster to process them. Queries over RDF data often involve complex self-joins, which are very expensive to run if the data are not carefully partitioned across the cluster, because distributed joins over massive amounts of data then become necessary. Existing RDF data partitioning methods can localize simple queries well but still resort to expensive distributed joins for more complex queries. In this paper, we propose a new data partitioning approach that makes use of the rich structural information in RDF datasets and minimizes the amount of data that has to be joined across different computing nodes. We conduct an extensive experimental study using two popular RDF benchmark datasets and one real RDF dataset that contain up to billions of RDF triples. The results indicate that our approach can produce a balanced, low-redundancy data partitioning scheme that avoids or largely reduces the cost of distributed joins even for very complicated queries. In terms of query execution time, our approach can outperform the state-of-the-art methods by orders of magnitude.
  • Keywords
    data handling; query languages; RDF data partitioning methods; big RDF datasets; complex self-joins; path partitioning; query execution time; redundant data partitioning scheme; scalable SPARQL querying; scale-out solutions; Approximation algorithms; Approximation methods; Data models; Distributed databases; Merging; Partitioning algorithms; Resource description framework
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2015 IEEE 31st International Conference on
  • Conference_Location
    Seoul
  • Type

    conf

  • DOI
    10.1109/ICDE.2015.7113334
  • Filename
    7113334