Title :
Fault-tolerant distributed mass storage for LHC computing
Author :
Wiebalck, Arne ; Breuer, Peter T. ; Lindenstruth, Volker ; Steinbeck, T.M.
Author_Institution :
Kirchhoff Inst. for Phys., Univ. of Heidelberg, Germany
Abstract :
In this paper we present the concept and first prototyping results of a modular, fault-tolerant distributed mass storage architecture for large Linux PC clusters such as those deployed by the upcoming particle physics experiments. Device masquerading with an Enhanced Network Block Device (ENBD) enables local RAID over remote disks, which is the key concept of the ClusterRAID system. The block-level interface to remote files, partitions, or disks provided by the ENBD makes it possible to use the standard Linux software RAID to add fault tolerance to the system. Preliminary performance measurements indicate that the latency is comparable to that of a local hard drive. With four disks, first prototypes achieved throughput rates of up to 55 MB/s for a RAID0 setup and about 40 MB/s for a RAID5 setup.
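As an illustration of why RAID5 over remote disks adds fault tolerance, the following is a minimal sketch of XOR parity across equal-sized blocks. It is not the ClusterRAID or ENBD implementation: the function names (`xor_blocks`, `make_stripe`, `rebuild`) and the toy 4-byte blocks are assumptions made for this example, and a real Linux software RAID additionally rotates parity across devices and works on much larger chunks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (hypothetical layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Four "disks": three data blocks plus one parity block (toy values).
data = [b"node", b"disk", b"data"]
stripe = make_stripe(data)

# Simulate losing the second disk and rebuilding its block from the rest.
recovered = rebuild(stripe, 1)
```

Because the parity block is the XOR of the data blocks, any single missing block, data or parity, equals the XOR of the remaining ones. This is also why a four-disk RAID5 set offers the usable capacity of three disks.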
Keywords :
RAID; distributed processing; fault tolerant computing; workstation clusters; ClusterRAID system; Enhanced Network Block Device; LHC computing; Linux PC clusters; device masquerading technique; fault-tolerant distributed mass storage; Computer architecture; Delay; Distributed computing; Fault tolerance; Fault tolerant systems; Large Hadron Collider; Linux; Measurement; Prototypes; Software standards
Conference_Title :
Cluster Computing and the Grid, 2003. Proceedings. CCGrid 2003. 3rd IEEE/ACM International Symposium on
Print_ISBN :
0-7695-1919-9
DOI :
10.1109/CCGRID.2003.1199377