DocumentCode :
659439
Title :
Scalable data citation in dynamic, large databases: Model and reference implementation
Author :
Proll, Stefan ; Rauber, Andreas
Author_Institution :
SBA Res., Vienna, Austria
fYear :
2013
fDate :
6-9 Oct. 2013
Firstpage :
307
Lastpage :
312
Abstract :
Uniquely and precisely identifying and citing arbitrary subsets of data is essential in many settings, e.g. to facilitate experiment validation and data re-use in meta-studies. Current approaches, which rely on pointers to entire data collections or on explicit copies of data, do not scale. We propose a novel approach relying on persistent, timestamped, adapted queries to versioned and timestamped data sources. Result set hashes are used to validate correctness on later re-execution. The proposed method works for static as well as dynamically growing or changing data. Alternative implementation styles for relational databases are presented and evaluated with regard to performance and impact on existing applications, while aiming at minimal to no additional effort for data users. The approach is validated in an infrastructure monitoring domain relying on sensor data networks.
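The idea summarized in the abstract (a stored, timestamped query against versioned data, checked by a result set hash on re-execution) can be illustrated with a minimal sketch. This is not the authors' reference implementation: the `readings` table, its `valid_from`/`valid_to` columns, the integer timestamps, the helper names, and the choice of SHA-256 are all assumptions made for illustration only.

```python
import hashlib
import sqlite3


def result_hash(rows):
    """Hash a result set in the order given (the query must sort stably)."""
    digest = hashlib.sha256()  # hash function is an assumption, not from the paper
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


conn = sqlite3.connect(":memory:")
# Hypothetical versioned table: valid_to IS NULL marks the current version.
conn.execute(
    "CREATE TABLE readings ("
    " sensor_id INTEGER, value REAL,"
    " valid_from INTEGER, valid_to INTEGER)"
)


def insert(sensor_id, value, ts):
    conn.execute(
        "INSERT INTO readings VALUES (?, ?, ?, NULL)", (sensor_id, value, ts)
    )


def update(sensor_id, value, ts):
    # Close the current version instead of overwriting it, then add a new one.
    conn.execute(
        "UPDATE readings SET valid_to = ?"
        " WHERE sensor_id = ? AND valid_to IS NULL",
        (ts, sensor_id),
    )
    insert(sensor_id, value, ts)


# The "adapted" query: the original selection plus a validity-time filter
# and a fixed sort order so the result hash is reproducible.
QUERY = (
    "SELECT sensor_id, value FROM readings"
    " WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)"
    " ORDER BY sensor_id"
)


def run_as_of(ts):
    return conn.execute(QUERY, (ts, ts)).fetchall()


insert(1, 20.5, 100)
insert(2, 18.0, 100)

# Cite the subset at time 150: store query text, timestamp, and result hash.
citation = {
    "query": QUERY,
    "timestamp": 150,
    "hash": result_hash(run_as_of(150)),
}

# The data keeps changing after the citation was issued.
update(1, 21.0, 200)
insert(3, 19.5, 200)

# Re-executing the stored query as of the stored timestamp reproduces the
# cited subset; the hash comparison validates it.
recomputed = result_hash(run_as_of(citation["timestamp"]))
current = run_as_of(300)
```

In this sketch only the query text, the timestamp, and the hash are stored with the citation, so no copy of the data is made, while `current` shows that the live view has moved on independently of the cited subset.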
Keywords :
citation analysis; query processing; relational databases; adapted queries; data citation; data sources; dynamic database; infrastructure monitoring domain; large database; model implementation; persistent queries; reference implementation; relational databases; sensor data network; timestamped queries; Data models; History; Monitoring; Relational databases; Sorting; Standards;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
Type :
conf
DOI :
10.1109/BigData.2013.6691588
Filename :
6691588