• DocumentCode
    659554
  • Title

    NativeTask: A Hadoop compatible framework for high performance

  • Author

    Dong Yang ; Xiang Zhong ; Dong Yan ; Fangqin Dai ; Xusen Yin ; Cheng Lian ; Zhongliang Zhu ; Weihua Jiang ; Gansha Wu

  • Author_Institution
    Intel Corp., Beijing, China
  • fYear
    2013
  • fDate
    6-9 Oct. 2013
  • Firstpage
    94
  • Lastpage
    101
  • Abstract
    Although Hadoop MapReduce provides good programming abstractions and horizontal scalability, it is often criticized for poor single-node performance. Meanwhile, MapReduce already has a large install base, so any performance improvement must preserve compatibility. In this paper, we address these challenges with several approaches guided by low-level performance analysis, and we implement them in NativeTask, a high-performance, fully compatible MapReduce execution engine. We evaluate its performance with representative HiBench workloads. The results show that NativeTask achieves speedups ranging from 10% to 160%, paving the way for a MapReduce that excels at both single-node performance and scalability. In the future, hardware acceleration could also be applied to further improve the system's efficiency.
  • Keywords
    distributed processing; software performance evaluation; Hadoop MapReduce; Hadoop compatible framework; NativeTask; hardware acceleration; high performance; horizontal scalability; low-level performance analysis; programming abstractions; representative HiBench workloads; system efficiency; Data processing; Engines; Java; Libraries; Optimization; Random access memory; Sorting; C++ implementation; CPU-bound application; Hadoop; cache-oblivious sort; compatibility
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Big Data, 2013 IEEE International Conference on
  • Conference_Location
    Silicon Valley, CA
  • Type

    conf

  • DOI
    10.1109/BigData.2013.6691703
  • Filename
    6691703