DocumentCode
2181103
Title
Extensions to the Pig data processing platform for scalable RDF data processing using Hadoop
Author
Tanimura, Yusuke ; Matono, Akiyoshi ; Lynden, Steven ; Kojima, Isao
Author_Institution
Inf. Technol. Res. Inst., Nat. Inst. of Adv. Ind. Sci. & Technol., Tsukuba, Japan
fYear
2010
fDate
1-6 March 2010
Firstpage
251
Lastpage
256
Abstract
In order to effectively handle the growing amount of available RDF data, a scalable and flexible RDF data processing framework is needed. We previously proposed a Hadoop-based framework, which takes advantage of scalable and fault-tolerant distributed processing technologies, originally proposed as Google's distributed file system and MapReduce parallel model. In this paper, we present a method extending the Pig data processing platform on top of the Hadoop infrastructure. Pig compiles programs written in a high-level language, called Pig Latin, into MapReduce programs that can be executed by Hadoop. In order to support RDF, Pig was extended with the ability to load and store RDF data efficiently. Furthermore, as reasoning is an important requirement for most systems storing RDF data, support for inferring new triples using entailment rules was also added. In this paper, we describe these extensions and present an evaluation of their performance.
Keywords
Java; distributed databases; fault tolerant computing; high level languages; Google; Hadoop infrastructure; MapReduce parallel model; Pig Latin; Pig data processing platform; distributed file system; entailment rules; fault tolerant distributed processing; high level language; scalable RDF data processing; Data processing; Distributed processing; File systems; Information technology; OWL; Open source software; Relational databases; Resource description framework; Scalability; Semantic Web;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering Workshops (ICDEW), 2010 IEEE 26th International Conference on
Conference_Location
Long Beach, CA
Print_ISBN
978-1-4244-6522-4
Electronic_ISBN
978-1-4244-6521-7
Type
conf
DOI
10.1109/ICDEW.2010.5452704
Filename
5452704