DocumentCode
1897731
Title
An Analytical Framework and Its Applications for Studying Brick Storage Reliability
Author
Chen, Ming ; Chen, Wei ; Liu, Likun ; Zhang, Zheng
Author_Institution
Microsoft Res. Asia, Beijing
fYear
2007
fDate
10-12 Oct. 2007
Firstpage
242
Lastpage
252
Abstract
The reliability of a large-scale storage system is influenced by a complex set of inter-dependent factors. This paper presents a comprehensive and extensible analytical framework that offers quantitative answers to many design tradeoffs. We apply the framework to a number of important design strategies that a designer and/or administrator must face in reality, including topology-aware replica placement and proactive replication, which uses small background network bandwidth and unused disk space to create additional copies. We also quantify the impact of slow (but potentially more accurate) failure detection and lazy replacement of failed disks. We use detailed simulation to verify and refine our analytical model. These results demonstrate the versatility of the framework and serve as a solid step towards more quantitative studies of fundamental system tradeoffs between reliability, performance, and cost in large-scale distributed storage systems.
Keywords
software reliability; storage area networks; system recovery; background network bandwidth; brick storage reliability; failure detection; interdependent factors; large-scale distributed storage system reliability; proactive replication; topology-aware replica placement; Analytical models; Bandwidth; Costs; Delay; Image storage; Large-scale systems; Maintenance; Network topology; Reliability; Switches
fLanguage
English
Publisher
IEEE
Conference_Titel
Reliable Distributed Systems, 2007. SRDS 2007. 26th IEEE International Symposium on
Conference_Location
Beijing
ISSN
1060-9857
Print_ISBN
0-7695-2995-X
Type
conf
DOI
10.1109/SRDS.2007.21
Filename
4365700