Title :
Impact of failure on interconnection networks for large storage systems
Author :
Xin, Qin ; Miller, Ethan L. ; Schwarz, Thomas J. E., S.J. ; Long, Darrell D. E.
Author_Institution :
Storage Syst. Res. Center, California Univ., Santa Cruz, CA, USA
Abstract :
Recent advances in large-capacity, low-cost storage devices have led to active research in the design of large-scale storage systems built from commodity devices for supercomputing applications. Such storage systems, composed of thousands of storage devices, are required to provide high system bandwidth and petabyte-scale data storage. A robust network interconnection is essential to achieve high bandwidth, low latency, and reliable delivery during data transfers. However, failures, such as temporary link outages and node crashes, are inevitable. We discuss the impact of potential failures on network interconnections in very large-scale storage systems and analyze the trade-offs among several storage network topologies by simulation. Our results suggest that a good interconnect topology is essential to the fault tolerance of a petabyte-scale storage system.
Keywords :
fault tolerant computing; multiprocessor interconnection networks; storage area networks; storage management; system recovery; fault-tolerance computing; interconnection network; large storage system; petabyte-scale data storage; storage device; storage network topology; Analytical models; Bandwidth; Computer crashes; Delay; Failure analysis; Large-scale systems; Memory; Multiprocessor interconnection networks; Network topology; Robustness;
Conference_Title :
Mass Storage Systems and Technologies, 2005. Proceedings. 22nd IEEE / 13th NASA Goddard Conference on
Print_ISBN :
0-7695-2318-8
DOI :
10.1109/MSST.2005.18