DocumentCode
140952
Title
Profiling and mining RDF data with ProLOD++
Author
Abedjan, Ziawasch ; Gruetze, Toni ; Jentzsch, Anja ; Naumann, Felix
Author_Institution
Hasso Plattner Inst. (HPI), Potsdam, Germany
fYear
2014
fDate
March 31 2014-April 4 2014
Firstpage
1198
Lastpage
1201
Abstract
Before reaping the benefits of open data to add value to an organization's internal data, such new, external datasets must be analyzed and understood at the basic level of data types, constraints, value patterns, etc. Such data profiling, already difficult for large relational data sources, is even more challenging for RDF datasets, the preferred data model for linked open data. We present ProLOD++, a novel tool for various profiling and mining tasks to understand and ultimately improve open RDF data. ProLOD++ comprises various traditional data profiling tasks, adapted to the RDF data model. In addition, it features many specific profiling results for open data, such as schema discovery for user-generated attributes, association rule discovery to uncover synonymous predicates, and uniqueness discovery along ontology hierarchies. ProLOD++ is highly efficient, allowing interactive profiling for users interested in exploring the properties and structure of as yet unknown datasets.
Keywords
data analysis; data mining; data models; ProLOD++; RDF data mining; RDF data model; RDF data profiling; association rule discovery; interactive profiling; ontology hierarchies; open RDF data; schema discovery; synonymous predicates; uniqueness discovery; user-generated attributes; Association rules; Data models; Data visualization; Ontologies; Pattern analysis; Resource description framework
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering (ICDE), 2014 IEEE 30th International Conference on
Conference_Location
Chicago, IL
Type
conf
DOI
10.1109/ICDE.2014.6816740
Filename
6816740