DocumentCode :
3678335
Title :
IOSIG+: On the Role of I/O Tracing and Analysis for Hadoop Systems
Author :
Bo Feng;Xi Yang;Kun Feng;Yanlong Yin;Xian-He Sun
Author_Institution :
Dept. of Comput. Sci., Illinois Inst. of Technol., Chicago, IL, USA
fYear :
2015
Firstpage :
62
Lastpage :
65
Abstract :
Hadoop, one of the most widely adopted MapReduce frameworks, is naturally data-intensive. Several projects that depend on it, such as Mahout and Hive, inherit this characteristic. Meanwhile, I/O optimization becomes a daunting task, since applications' source code is not always available. I/O traces for Hadoop and its dependents are increasingly important because they can faithfully reveal intrinsic I/O behaviors without knowledge of the source code. Tracing can not only help diagnose system bottlenecks but also guide further performance optimization. To achieve this goal, we propose a transparent tracing and analysis tool suite, namely IOSIG+, which can be plugged into a Hadoop system. We make several contributions: 1) we describe our approach to tracing; 2) we release the tracer, which can trace I/O operations without modifying the targets' source code; 3) we adopt several techniques to mitigate the execution overhead introduced at runtime; 4) we create an analyzer, which helps to discover new approaches to addressing I/O problems according to access patterns. The experimental results and analysis confirm its effectiveness, and the observed overhead can be as low as 1.97%.
Keywords :
"Java","Throughput","Optimization","Runtime","Tuning","Yarn","Performance evaluation"
Publisher :
IEEE
Conference_Titel :
Cluster Computing (CLUSTER), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/CLUSTER.2015.17
Filename :
7307564