Author_Institution :
Inria Saclay-Ile-de-France, Univ. Paris-Sud, Orsay, France
Abstract :
The W3C's Resource Description Framework (RDF) is a promising candidate for delivering many of the original promises of semi-structured data: flexible structure, optional schema, and rich, flexible URIs as a basis for information sharing. Moreover, RDF is uniquely positioned to benefit from the efforts of scientific communities studying databases, knowledge representation, and Web technologies. Many RDF data collections are being published, ranging from scientific data to general-purpose ontologies to open government data, in particular as part of the Linked Data movement. Managing such large volumes of RDF data is challenging because of their sheer size, their heterogeneity, and the further complexity brought by RDF reasoning. To tackle the size challenge, distributed storage architectures are required. Cloud computing is an emerging paradigm that has been massively adopted in many applications for the scalability, fault tolerance, and elasticity it provides. This tutorial discusses the problems involved in efficiently handling massive amounts of RDF data in a cloud environment. We provide the necessary background, analyze and classify existing solutions, and discuss open problems and perspectives.
Keywords :
cloud computing; database management systems; fault tolerant computing; inference mechanisms; memory architecture; ontologies (artificial intelligence); RDF data collection; RDF data management; RDF reasoning; W3C; Web technology; cloud environment; database; distributed storage architecture; fault-tolerance; flexible URI; flexible structure; general-purpose ontology; information sharing; knowledge representation; linked data movement; open government data; optional schema; resource description framework; scientific community; scientific data; semistructured data; Cognition; Conferences; Query processing; Tutorials