DocumentCode :
155320
Title :
A novel cloud based elastic framework for big data preprocessing
Author :
Dawelbeit, Omer ; McCrindle, Rachel
Author_Institution :
Sch. of Syst. Eng., Univ. of Reading, Reading, UK
fYear :
2014
fDate :
25-26 Sept. 2014
Firstpage :
23
Lastpage :
28
Abstract :
A number of analytical big data services based on the cloud computing paradigm, such as Amazon Redshift and Google BigQuery, have recently emerged. These services are based on columnar databases rather than traditional Relational Database Management Systems (RDBMS) and are able to analyse massive datasets in mere seconds. This has led many organisations to retain and analyse their massive logs, sensory or marketing datasets, which were previously discarded due to the inability to either store or analyse them. Although these big data services have addressed the issue of big data analysis, the ability to efficiently de-normalise and prepare this data in a format that can be imported into these services remains a challenge. This paper describes and implements a novel, generic and scalable cloud-based elastic framework for Big Data Preprocessing (BDP). Since the approach described by this paper is entirely based on cloud computing, it is also possible to measure the overall cost incurred by these preprocessing activities.
Keywords :
Big Data; cloud computing; data analysis; relational databases; Amazon Redshift; BDP; Google BigQuery; RDBMS; analytical big data services; big data analysis; big data preprocessing; cloud based elastic framework; cloud computing paradigm; columnar databases; marketing datasets; massive logs; relational database management systems; Big data; Cloud computing; Computer science; Educational institutions; Google; Program processors; Runtime
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Science and Electronic Engineering Conference (CEEC), 2014 6th
Conference_Location :
Colchester
Type :
conf
DOI :
10.1109/CEEC.2014.6958549
Filename :
6958549