DocumentCode
685913
Title
Understanding the Impacts of Solid-State Storage on the Hadoop Performance
Author
Dan Wu ; Wenhai Luo ; Wenyan Xie ; Xiaoheng Ji ; Jian He ; Di Wu
Author_Institution
Network & Inf. Branch, Guangdong Electr. Power Design Inst., Guangzhou, China
fYear
2013
fDate
13-15 Dec. 2013
Firstpage
125
Lastpage
130
Abstract
The superior I/O performance of solid-state storage (e.g., solid-state drives, SSDs) makes it an attractive replacement for traditional magnetic storage (e.g., hard disk drives, HDDs). More and more storage systems are integrating solid-state storage into their architecture. To understand the impact of solid-state storage on the performance of Hadoop applications, we consider a hybrid Hadoop storage system consisting of both HDDs and SSDs, and conduct a series of experiments to evaluate Hadoop performance under various system configurations. We find that Hadoop performance increases almost linearly with the fraction of SSDs in the storage system, and that the improvement is more significant for larger datasets. In addition, the performance of Hadoop applications running on SSD-dominant storage systems is insensitive to variations in block size and buffer size, which differs significantly from HDD-dominant storage systems. With a larger fraction of SSDs, Hadoop operators do not need to carefully tune block size and buffer size to achieve optimal performance. Our findings also indicate that the Hadoop storage system can be upgraded by increasing SSD capacity linearly with the scale of the applications.
Keywords
buffer storage; public domain software; software performance evaluation; Hadoop applications; Hadoop performance; SSD-dominant storage systems; block size; buffer size; hybrid Hadoop storage system; solid-state storage; superior I/O performance; Benchmark testing; Buffer storage; Computer architecture; Merging; Performance evaluation; Servers; Sorting; Hadoop configuration; Hadoop performance; Hybrid storage system; Solid-state storage;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Cloud and Big Data (CBD), 2013 International Conference on
Conference_Location
Nanjing
Print_ISBN
978-1-4799-3260-3
Type
conf
DOI
10.1109/CBD.2013.39
Filename
6824584