DocumentCode
659554
Title
NativeTask: A Hadoop compatible framework for high performance
Author
Dong Yang ; Xiang Zhong ; Dong Yan ; Fangqin Dai ; Xusen Yin ; Cheng Lian ; Zhongliang Zhu ; Weihua Jiang ; Gansha Wu
Author_Institution
Intel Corp., Beijing, China
fYear
2013
fDate
6-9 Oct. 2013
Firstpage
94
Lastpage
101
Abstract
Although Hadoop MapReduce provides good programming abstractions and horizontal scalability, it is often blamed for its poor single-node performance. Meanwhile, MapReduce has already achieved a large install base, so any performance improvement must preserve compatibility. In this paper, we address these challenges with several approaches guided by low-level performance analysis, and we materialize them in NativeTask, a high-performance, fully compatible MapReduce execution engine. We evaluate its performance with representative HiBench workloads. The results show that NativeTask achieves speedups ranging from 10% to 160%, and it paves the way for a better MapReduce that excels at both single-node performance and scalability. In the future, hardware acceleration can also be applied to further improve the system's efficiency.
Keywords
distributed processing; software performance evaluation; Hadoop MapReduce; Hadoop compatible framework; NativeTask; hardware acceleration; high performance; horizontal scalability; low-level performance analysis; programming abstractions; representative HiBench workloads; system efficiency; Data processing; Engines; Java; Libraries; Optimization; Random access memory; Sorting; C++ implementation; CPU-bound application; Hadoop; cache-oblivious sort; compatibility
fLanguage
English
Publisher
IEEE
Conference_Titel
Big Data, 2013 IEEE International Conference on
Conference_Location
Silicon Valley, CA
Type
conf
DOI
10.1109/BigData.2013.6691703
Filename
6691703