• DocumentCode
    710124
  • Title
    Executing queries over schemaless RDF databases
  • Author
    Aluc, Gunes ; Ozsu, M. Tamer ; Daudjee, Khuzaima ; Hartig, Olaf
  • Author_Institution
    Cheriton Sch. of Comput. Sci., Univ. of Waterloo, Waterloo, ON, Canada
  • fYear
    2015
  • fDate
    13-17 April 2015
  • Firstpage
    807
  • Lastpage
    818
  • Abstract
    Recent advances in Linked Data Management and the Semantic Web have led to a rapid increase in both the quantity and the variety of Web applications that rely on the SPARQL interface to query RDF data. Thus, RDF data management systems are increasingly exposed to workloads that are far more diverse and dynamic than what these systems were designed to handle. The problem is that existing systems rely on a workload-oblivious physical representation that has a fixed schema, which is not suitable for diverse and dynamic workloads. To address these issues, we propose a physical representation that is schemaless. The resulting flexibility enables an RDF dataset to be clustered based purely on the workload, which is key to achieving good performance through optimized I/O and cache utilization. Consequently, given a workload, we develop techniques to compute a good clustering of the database. We also design a new query evaluation model, namely schemaless-evaluation, that leverages this workload-aware clustering of the database, whereby, with high probability, each tuple in the result set of a query is expected to be contained in at most one cluster. Our query evaluation model exploits this property to achieve better performance while ensuring fast generation of query plans without being hindered by the lack of a fixed physical schema.
  • Keywords
    cache storage; database management systems; pattern clustering; query processing; semantic Web; RDF data management systems; RDF dataset; SPARQL interface; Web applications; cache utilization; dynamic workloads; linked data management; query evaluation model; query execution; schemaless RDF databases; workload-aware clustering; workload-oblivious physical representation; Clustering algorithms; Computational modeling; Indexes; Query processing; Resource description framework; Standards
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2015 IEEE 31st International Conference on
  • Conference_Location
    Seoul
  • Type
    conf
  • DOI
    10.1109/ICDE.2015.7113335
  • Filename
    7113335