DocumentCode :
117316
Title :
Characterization of semi-synthetic dataset for big-data semantic analysis
Author :
Techentin, Robert ; Foti, Daniel ; Al-Saffar, Sinan ; Li, Peter ; Daniel, Erik ; Gilbert, Barry ; Holmes, David
Author_Institution :
Mayo Clinic Coll. of Med., Rochester, MN, USA
fYear :
2014
fDate :
9-11 Sept. 2014
Firstpage :
1
Lastpage :
6
Abstract :
Over the past decade, the use of semantic databases has served as the basis for storing and analyzing complex, heterogeneous, and irregular data. While there are similarities with traditional relational database systems, semantic data stores provide a rich platform for conducting non-traditional analyses of data. In support of new graph analytic algorithms and specialized graph analytic hardware, we have developed a large semi-synthetic, semantically rich dataset. The construction of this dataset mimics the real-world scenario of using relational databases as the basis for semantic data construction. In order to achieve real-world variable distributions and variable dependencies, data.gov data was used as the basis for developing an approach to build arbitrarily large semi-synthetic datasets. The intent of the semi-synthetic dataset is to serve as a testbed for new semantic graph analyses and computational software/hardware platforms. The construction process and basic data characterization are described. All code related to the data collection, consolidation, and augmentation is available for distribution.
Keywords :
Big Data; data analysis; relational databases; semantic Web; big-data semantic analysis; computational software-hardware platforms; data.gov data; graph analytic algorithms; relational database systems; semantic data construction; semantic databases; semantic graph analyses; semisynthetic dataset characterization; specialized graph analytic hardware; Benchmark testing; Complexity theory; Data warehouses; Relational databases; Resource description framework; Semantics; RDF; big data; data.gov; graph computing; semantic representation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Extreme Computing Conference (HPEC), 2014 IEEE
Conference_Location :
Waltham, MA
Print_ISBN :
978-1-4799-6232-7
Type :
conf
DOI :
10.1109/HPEC.2014.7040994
Filename :
7040994