Title :
Linked Enterprise Data Model and Its Use in Real Time Analytics and Context-Driven Data Discovery
Author :
Taneja, Kunal ; Qian Zhu ; Duggan, Desmond ; Tung, Teresa
Author_Institution :
Accenture Technol. Labs., San Jose, CA, USA
Date :
June 27 2015-July 2 2015
Abstract :
Traditional approaches for managing enterprise data revolve around a batch-driven Extract Transform Load (ETL) process, a one-size-fits-all approach to storage, and an application architecture that is tightly coupled to the underlying data infrastructure. The emergence of Big Data technologies has led to alternate instantiations of the traditional approach, in which storage systems have moved from relational databases to NoSQL technologies like HDFS. This approach to data management has been found wanting as enterprises begin to deal with complex and heterogeneous data, especially in the area of the Internet of Things (IoT). IoT environments are characterized by data producers and data processing requirements. In this paper, we articulate the shortcomings of traditional approaches to data management in the context of IoT. We identify the challenges posed by content heterogeneity, requirements of scale, the robustness of ETL processes, and the need to rapidly onboard and support multiple applications such as analytics. Our approach introduces the Linked Enterprise Data Model (LEDM), a knowledge representation approach derived from Linked Data for modeling and linking the disparate aspects of data infrastructure. We leverage this model in developing a scalable and robust ETL framework. The framework adopts the Lambda architecture approach and supports both stream and batch processing of incoming data. We build this capability for the streaming leg of the Lambda architecture, which comprises Amazon Kinesis, Apache Spark Streaming, and Amazon Dynamo.
Keywords :
Big Data; Internet of Things; business data processing; data analysis; data models; knowledge representation; Amazon Dynamo; Amazon Kinesis; Apache Spark Streaming; ETL process robustness; HDFS; IoT environment; LEDM; Lambda architecture approach; Linked Enterprise Data Model; NoSQL technology; batch driven extract transform load process; batch processing; big data technology; content heterogeneity; context-driven data discovery; data infrastructure; enterprise data management; heterogeneous data; knowledge representation approach; real time analytics; relational database; scale requirement; storage system; stream processing; Computer architecture; Object oriented modeling; Resource description framework; Silicon; Sparks; Unified modeling language; Lambda architecture; Linked Data
Conference_Title :
Mobile Services (MS), 2015 IEEE International Conference on
Conference_Location :
New York, NY
Print_ISBN :
978-1-4673-7283-1
DOI :
10.1109/MobServ.2015.47