DocumentCode :
166066
Title :
Dynamic Colocation Algorithm for Hadoop
Author :
Babu, B. Ganesh ; Shabeera, T.P. ; Madhu Kumar, S.D.
Author_Institution :
Dept. of CSE, Nat. Inst. of Technol., Calicut, India
fYear :
2014
fDate :
24-27 Sept. 2014
Firstpage :
2643
Lastpage :
2647
Abstract :
Hadoop is a widely accepted platform for developing large-scale, data-intensive applications. It is an open source implementation of Google's MapReduce framework. Hadoop's current data placement policy distributes data among DataNodes using a random placement policy, for simplicity and load balance. This simple placement works well for Hadoop applications that access data from a single file, but when an application needs data from different files simultaneously, performance normally degrades. Identifying related files and placing them on the same DataNode or on adjacent DataNodes reduces both network overhead and query span. We propose a Dynamic Colocation Algorithm that colocates datasets that are frequently accessed together, which decreases the average number of machines involved in processing a query and hence reduces network overhead. Our technique checks the relations between datasets dynamically and rearranges the datasets according to those relations. Our experimental results show that, after colocation, there is a significant reduction in the execution time of MapReduce programs.
Keywords :
parallel processing; public domain software; query processing; random processes; resource allocation; DataNodes; Google MapReduce framework; Hadoop; MapReduce programs; data placement policy; dynamic colocation algorithm; large-scale data intensive applications; load balancing; network overhead; open source implementation; query span; random placement policy; Algorithm design and analysis; Availability; Clustering algorithms; Distributed databases; Heuristic algorithms; Partitioning algorithms; Data Placement; Hadoop; MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on
Conference_Location :
New Delhi
Print_ISBN :
978-1-4799-3078-4
Type :
conf
DOI :
10.1109/ICACCI.2014.6968384
Filename :
6968384