DocumentCode :
3717246
Title :
Semantics for Big Data access & integration: Improving industrial equipment design through increased data usability
Author :
Jenny Weisenberg Williams;Paul Cuddihy;Justin McHugh;Kareem S. Aggour;Arvind Menon;Steven M. Gustafson;Timothy Healy
Author_Institution :
Knowledge Discovery Lab, GE Global Research, Niskayuna, NY 12309 USA
fYear :
2015
Firstpage :
1103
Lastpage :
1112
Abstract :
With the advent of Big Data technologies, organizations can efficiently store and analyze more data than ever before. However, extracting maximal value from this data can be challenging for many reasons. For example, datasets are often not stored using human-understandable terms, making it difficult for a large set of users to benefit from them. Further, given that different types of data may be best stored using different technologies, datasets that are closely related may be stored separately with no explicit linkage. Finally, even within individual data stores, there are often inconsistencies in data representations, whether introduced over time or due to different data producers. These challenges are further compounded by frequent additions to the data, including new raw data as well as results produced by large-scale analytics. Thus, even within a single Big Data environment, it is often the case that multiple rich datasets exist without the means to access them in a unified and cohesive way, often leading to lost value. This paper describes the development of a Big Data management infrastructure with semantic technologies at its core to provide a unified data access layer and a consistent approach to analytic execution. Semantic technologies were used to create domain models describing mutually relevant datasets and the relationships between them, with a graphical user interface to transparently query across datasets using domain-model terms. This prototype system was built for GE Power & Water's Power Generation Products Engineering Division, which has produced over 50 TB of gas turbine and component prototype test data to date. The system is expected to result in significant savings in productivity and expenditure.
Keywords :
"Big data","Semantics","Time series analysis","Data models","Databases","Wind turbines"
Publisher :
IEEE
Conference_Title :
Big Data (Big Data), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/BigData.2015.7363864
Filename :
7363864