Title :
Balanced and Predictable Networked Storage
Author :
Kelley, Jaimie ; Stewart, Craig
Author_Institution :
Ohio State Univ., Columbus, OH, USA
Abstract :
Networking bandwidth and latency have improved in recent years, prompting a wide range of workloads to move back to key-value stores, databases, and other types of networked storage. However, networked storage has a well-known drawback: outlier access times create a heavy-tailed distribution, and outlier accesses can take much longer than normal accesses. This paper studies the effects of outliers on data processing workloads. These workloads strive for balance, i.e., all nodes are kept busy at all times. Outlier accesses can cause bubbles in the pipeline, slowing down the whole workload. For this paper, we modeled the effect of outliers in balanced map reduce systems. We found that outliers can cause a 70% slowdown. We also modeled a solution: use 5% of system resources on replication for predictability, an old but seldom-used approach to masking outliers. We found that this approach could return more than 5% in speedup.
Keywords :
Big Data; statistical distributions; storage area networks; storage management; balanced map reduce systems; balanced networked storage; data processing workloads; heavy tailed distribution; latency; mask outliers; networking bandwidth; outlier access times; predictable networked storage; Bandwidth; Data handling; Data processing; Data storage systems; Delays; Information management; Mathematical model; heavy tail distribution; map reduce; networked storage; replication for predictability
Conference_Title :
Distributed Computing Systems Workshops (ICDCSW), 2013 IEEE 33rd International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-1-4799-3247-4
DOI :
10.1109/ICDCSW.2013.50