DocumentCode
611047
Title
Bi-Hadoop: Extending Hadoop to Improve Support for Binary-Input Applications
Author
Xiao Yu; Bo Hong
fYear
2013
fDate
13-16 May 2013
Firstpage
245
Lastpage
252
Abstract
The MapReduce programming model, along with its open-source implementation, Hadoop, has provided a cost-effective solution for many data-intensive applications. Hadoop stores data in a distributed fashion and exploits data locality by assigning tasks to the nodes where the data is stored. Many data-intensive applications, however, require two (or more) input data sets for each of their tasks. Such applications pose significant challenges for Hadoop because the inputs to one task often reside on multiple nodes, and Hadoop is unable to discover data locality in this scenario. This often leads to excessive data transfers and significant degradation in application performance. In this paper, we present Bi-Hadoop, an efficient extension of Hadoop that better supports binary-input applications. Bi-Hadoop integrates an easy-to-use user interface, a binary-input aware task scheduler, and a caching subsystem. Extensive experiments show that Bi-Hadoop can significantly improve the execution of binary-input applications by reducing data transfer overhead, and that it outperforms existing Hadoop by up to 3.3x.
Keywords
cache storage; data handling; public domain software; scheduling; user interfaces; Bi-Hadoop; MapReduce programming model; application performance degradation; binary-input application execution; binary-input aware task scheduler; caching subsystem; data locality; data storage; data transfer overhead reduction; data-intensive application; open-source implementation; task assignment; user interface; Data transfer; Dispatching; Scheduling algorithms; Sparse matrices; User interfaces; Vectors; Data Locality; Hadoop; MapReduce;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on
Conference_Location
Delft
Print_ISBN
978-1-4673-6465-2
Type
conf
DOI
10.1109/CCGrid.2013.56
Filename
6546099