DocumentCode :
1667543
Title :
Developing a Real-Time Data Analytics Framework Using Hadoop
Author :
Cha, Sangwhan ; Wachowicz, Monica
Author_Institution :
Dept. of Geodesy & Geomatics Eng., Univ. of New Brunswick Fredericton, Fredericton, NB, Canada
fYear :
2015
Firstpage :
657
Lastpage :
660
Abstract :
Most existing workflows are based on meta-heuristics that produce good, dynamic heuristics and map workflow tasks to services on the fly, but they cannot support analytical tasks that must account for data types and real-time processing. This paper addresses that problem by developing a real-time data analytics framework that handles real-time processing of the structured and unstructured data needed for different analytical tasks, ranging from data ingestion and processing to data exploration and visualization. We propose an architecture based on the Storm/YARN projects for ingestion, processing, exploration, and visualization of streaming structured and unstructured data. We have implemented the proposed architecture using Apache Storm-related APIs in both local mode and distributed mode. We describe our experiments with the prototype implementation of the architecture, evaluating the functional requirements of each component and running non-functional tests such as real-time update performance and the time taken for data to flow among components. All components handled their own functionality properly. We also report the main results of a non-functional test in order to discuss the efficiency of our system.
Keywords :
application program interfaces; data analysis; data structures; data visualisation; API; Apache Storm; Hadoop; architecture prototype implementation; data exploration; data ingestion; data types; data visualization; functional requirements; nonfunctional tests; real-time data analytics framework; real-time processing; system efficiency; unstructured data; Big data; Data analysis; Data visualization; Fasteners; Prototypes; Real-time systems; Storms; Hadoop; MapReduce; Real time data analytics; Storm; YARN;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data (BigData Congress), 2015 IEEE International Congress on
Conference_Location :
New York, NY
Print_ISBN :
978-1-4673-7277-0
Type :
conf
DOI :
10.1109/BigDataCongress.2015.102
Filename :
7207286