DocumentCode :
2213624
Title :
Scalable RDF store based on HBase and MapReduce
Author :
Sun, Jianling ; Jin, Qiang
Author_Institution :
Dept. of Comput. Sci., Zhejiang Univ., Hangzhou, China
Volume :
1
fYear :
2010
fDate :
20-22 Aug. 2010
Abstract :
The growing size of Resource Description Framework (RDF) datasets requires RDF repositories to be highly scalable and efficient. Distributed and parallel processing models meet this need naturally. In this paper, we propose a scalable RDF store based on HBase, a distributed, column-oriented database modeled after Google's Bigtable. Our approach adopts the idea of Hexastore and considers both the RDF data model and HBase's capabilities. We store RDF triples in six HBase tables (S_PO, P_SO, O_SP, PS_O, SO_P and PO_S), which cover all combinations of RDF triple patterns, and index them using the row-key index structure that HBase provides. Besides presenting the storage schema, we also propose a MapReduce strategy for SPARQL Basic Graph Pattern (BGP) processing that suits this schema. It uses multiple MapReduce jobs to process a typical BGP. In each job, it uses a greedy method to select the join key and eliminates multiple triple patterns. The evaluation results indicate that our approach works well on large RDF datasets.
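The six-table layout named in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation and does not use HBase: plain Python dictionaries stand in for the tables, and the reading of each table name (the letters before the underscore form the row key, the letters after it form the stored value) is an assumption inferred from the names. The functions `build_tables`, `choose_table`, and `match` are hypothetical helpers, and the fallback to S_PO for fully bound or fully unbound patterns is likewise an assumption.

```python
from collections import defaultdict

# Table names as listed in the abstract. Assumed reading: the letters before
# "_" form the row key, the letters after it form the stored value.
TABLE_NAMES = ["S_PO", "P_SO", "O_SP", "PS_O", "SO_P", "PO_S"]


def build_tables(triples):
    """Store every (s, p, o) triple in all six tables (in-memory stand-in)."""
    tables = {name: defaultdict(set) for name in TABLE_NAMES}
    for s, p, o in triples:
        term = {"S": s, "P": p, "O": o}
        for name in TABLE_NAMES:
            key_part, value_part = name.split("_")
            row_key = tuple(term[c] for c in key_part)
            value = tuple(term[c] for c in value_part)
            tables[name][row_key].add(value)
    return tables


def choose_table(s=None, p=None, o=None):
    """Pick the table whose row key is exactly the bound positions."""
    bound = [c for c, v in zip("SPO", (s, p, o)) if v is not None]
    for name in TABLE_NAMES:
        if sorted(name.split("_")[0]) == sorted(bound):
            return name
    # Assumption: nothing bound or everything bound falls back to S_PO.
    return "S_PO"


def match(tables, s=None, p=None, o=None):
    """Return all stored triples matching a pattern; None means unbound."""
    name = choose_table(s, p, o)
    key_part, value_part = name.split("_")
    pattern = {"S": s, "P": p, "O": o}
    table = tables[name]
    if all(pattern[c] is not None for c in key_part):
        # Row key fully known: a single row lookup is enough.
        row_key = tuple(pattern[c] for c in key_part)
        rows = [(row_key, table.get(row_key, set()))]
    else:
        # Row key not fully known: scan the whole table.
        rows = list(table.items())
    results = set()
    for row_key, values in rows:
        for value in values:
            term = dict(zip(key_part + value_part, row_key + value))
            if all(pattern[c] in (None, term[c]) for c in "SPO"):
                results.add((term["S"], term["P"], term["O"]))
    return results
```

Under these assumptions, a pattern with one or two bound positions resolves to a single row-key lookup in the matching table, which is the benefit the abstract attributes to covering all triple-pattern combinations.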
Keywords :
distributed databases; parallel processing; HBase; Hexastore; MapReduce strategy; RDF triple patterns; SPARQL basic graph pattern processing; column-oriented database; distributed database; distributed processing model; greedy method; parallel processing model; resource description framework; scalable RDF store; Analytical models; Indexes; Irrigation; Lead; Semantics; Web pages; MapReduce; RDF; SPARQL
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advanced Computer Theory and Engineering (ICACTE), 2010 3rd International Conference on
Conference_Location :
Chengdu
ISSN :
2154-7491
Print_ISBN :
978-1-4244-6539-2
Type :
conf
DOI :
10.1109/ICACTE.2010.5578937
Filename :
5578937