DocumentCode :
2235954
Title :
Fault Tolerance of Tornado Codes for Archival Storage
Author :
Woitaszek, Matthew ; Tufo, Henry M.
Author_Institution :
Colorado Univ., Boulder, CO
fYear :
2006
fDate :
0-0 0
Firstpage :
83
Lastpage :
92
Abstract :
This paper examines Tornado codes, a class of low-density parity-check (LDPC) erasure codes, for applications in archival storage systems. The fault tolerance of Tornado code graphs is analyzed, and it is shown that worst-case failure scenarios in small (96-node) graphs can be identified and mitigated by using simulations to find and eliminate critical node sets that can cause Tornado codes to fail even when almost all blocks are present. The resulting graph construction procedure is then used to build a 96-device Tornado code storage system, with capacity overhead equivalent to RAID 10, that tolerates any 4 device failures. This system is demonstrated to be superior to other parity-based RAID systems. Finally, the paper describes how a geographically distributed data stewarding system can be enhanced by using cooperatively selected Tornado code graphs to obtain fault tolerance exceeding that of its constituent storage sites or site replication strategies.
Keywords :
RAID; fault tolerant computing; graph theory; parity check codes; storage management; LDPC erasure code; Tornado code graph; archival storage system; distributed data stewarding system; fault tolerance; low density parity check; parity-based RAID system; Analytical models; Availability; Failure analysis; Fault diagnosis; Fault tolerance; Fault tolerant systems; Information retrieval; Parity check codes; Throughput; Tornadoes
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Distributed Computing, 2006 15th IEEE International Symposium on
Conference_Location :
Paris
ISSN :
1082-8907
Print_ISBN :
1-4244-0307-3
Type :
conf
DOI :
10.1109/HPDC.2006.1652139
Filename :
1652139