DocumentCode
651608
Title
Balanced and Predictable Networked Storage
Author
Kelley, Jaimie ; Stewart, Craig
Author_Institution
Ohio State Univ., Columbus, OH, USA
fYear
2013
fDate
8-11 July 2013
Firstpage
202
Lastpage
207
Abstract
Networking bandwidth and latency have improved in recent years, prompting a wide range of workloads to move back to key-value stores, databases, and other types of networked storage. However, networked storage has a well-known drawback: outlier access times create a heavy-tailed distribution, in which some accesses take much longer than normal. This paper studies the effects of outliers on data processing workloads. These workloads strive for balance, i.e., all nodes are kept busy at all times. Outlier accesses can cause bubbles in the pipeline, slowing down the whole workload. For this paper, we modeled the effect of outliers in balanced map reduce systems. We found that outliers can cause a 70% slowdown. We also modeled a solution: use 5% of system resources on replication for predictability, an old but seldom-used approach to masking outliers. We found that this approach could return more than 5% in speedup.
Keywords
Big Data; statistical distributions; storage area networks; storage management; balanced map reduce systems; balanced networked storage; data processing workloads; heavy tailed distribution; latency; mask outliers; networking bandwidth; outlier access times; predictable networked storage; Bandwidth; Data handling; Data processing; Data storage systems; Delays; Information management; Mathematical model; heavy tail distribution; map reduce; networked storage; replication for predictability
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Computing Systems Workshops (ICDCSW), 2013 IEEE 33rd International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
978-1-4799-3247-4
Type
conf
DOI
10.1109/ICDCSW.2013.50
Filename
6679888