Title :
Analysing Hadoop performance in a multi-user IaaS Cloud
Author :
Conejero, Javier ; Caminero, B. ; Carrion, Carmen
Author_Institution :
Dept. of Comput. Syst., Univ. of Castilla-La Mancha, Albacete, Spain
Abstract :
Over the last few years, Big Data analysis (i.e., crunching enormous amounts of data from different sources to extract useful knowledge for improving business objectives) has attracted huge attention from enterprises and research institutions. One of the most successful paradigms for analysing such volumes of data is MapReduce, and in particular Hadoop, its open source implementation. However, Hadoop-based applications require massive amounts of resources to carry out different analyses of large amounts of data. These growing demands that research institutions and enterprises place on current computing infrastructures drive the adoption of Cloud computing, where there is an increasing demand for Hadoop as a Service. Since Hadoop requires a distributed environment in order to operate, a significant problem is where its resources are located. In Cloud environments, this problem lies mainly in the criteria used for Virtual Machine (VM) placement. The work presented in this paper analyses the performance, power consumption and resource usage of Hadoop applications when Hadoop is deployed on Virtual Clusters (VCs) within a private IaaS Cloud. More precisely, it measures the impact of different VM placement strategies on Hadoop-based application performance, power consumption and resource usage. As a result, some conclusions on the optimal criteria for VM deployment are provided.
Keywords :
Big Data; cloud computing; data analysis; parallel programming; resource allocation; virtual machines; Big Data analysis; Hadoop performance; Hadoop-as-a-service; Hadoop-based applications; MapReduce; VM placement strategies; infrastructure-as-a-service; multiuser IaaS cloud; power consumption; resource usage; virtual clusters; Focusing; Power demand; Resource management; Sentiment analysis; Virtual machining; Virtualization; Big-Data; Cloud; Deployment; Hadoop; IaaS; Performance; Quality of Service
Conference_Title :
2014 International Conference on High Performance Computing & Simulation (HPCS)
Conference_Location :
Bologna
Print_ISBN :
978-1-4799-5312-7
DOI :
10.1109/HPCSim.2014.6903713