DocumentCode :
3769968
Title :
Capturing provenance for big data analytics done using SQL interface
Author :
Anu Mary Chacko;Ajeeb M Basheer;S D Madhu Kumar
Author_Institution :
Department of Computer Science and Engineering, National Institute of Technology, Calicut, India 673601
fYear :
2015
Firstpage :
1
Lastpage :
6
Abstract :
In this era of data explosion, big data research is gaining importance. New technologies for data management in big data include NoSQL databases (e.g. MongoDB, Cassandra) and analytic tools (e.g. MapReduce, Hive). These tools do not have the SQL query interface that users are very familiar with. Postgres 9.1 therefore gave designers the option of foreign data wrappers, which interface Postgres with data held in other data stores, relational or not. Using foreign data wrappers, data in external data stores can be linked to the Postgres interface and analyzed with SQL queries. Provenance is metadata that captures the relation between input data and the output result, which is very useful for debugging that result. PERM is a tool developed as an extension to Postgres 8.3 to make Postgres provenance-aware. In this paper we present an extension of PERM that captures provenance for data accessed from external data stores through foreign data wrappers. PERM implements PERM Influence Contribution Semantics. We propose an extension to the contribution semantics currently used by PERM to capture 'when' and 'who' provenance, which is important in the context of big data analytics. We ported PERM to Postgres 9.3 and added new modules for capturing 'when' provenance. The implementation was verified by writing a foreign data wrapper for MongoDB, and performance was evaluated by running queries against it.
Keywords :
"Semantics","Big data","Context","Relational databases","Standards","Computers"
Publisher :
IEEE
Conference_Title :
Electrical Computer and Electronics (UPCON), 2015 IEEE UP Section Conference on
Type :
conf
DOI :
10.1109/UPCON.2015.7456749
Filename :
7456749