DocumentCode :
3134639
Title :
DataMover: robust terabyte-scale multi-file replication over wide-area networks
Author :
Sim, Alex ; Gu, Junmin ; Shoshani, Arie ; Natarajan, Vijaya
Author_Institution :
Lawrence Berkeley Nat. Lab., CA, USA
fYear :
2004
fDate :
21-23 June 2004
Firstpage :
403
Lastpage :
412
Abstract :
Typically, large scientific datasets (on the order of terabytes) are generated at large computational centers and stored on mass storage systems. However, large subsets of the data need to be moved to facilities available to application scientists for analysis. Replicating thousands of files is a tedious, error-prone, but extremely important task in scientific applications. Automating the file replication task requires automatic space acquisition and reuse; monitoring the progress of staging thousands of files from the source mass storage system, transferring them over the network, and archiving them at the target mass storage system or disk systems; and recovering from transient system failures. We have developed a robust replication system, called DataMover, which is now in regular use in high-energy physics and climate modeling experiments. Only a single command is necessary to request multi-file replication or the replication of an entire directory. A Web-based tool was developed to dynamically monitor the progress of the multi-file replication process.
Keywords :
Internet; data acquisition; data handling; high energy physics instrumentation computing; storage management; very large databases; wide area networks; DataMover; Web-based tool; application scientists; automatic space acquisition; automatic space reuse; climate modeling experiments; data transfer; disk systems; dynamic monitoring; file replication; high-energy-physics experiments; large data movement; large scientific datasets; mass storage systems; progress monitoring; robust replication system; terabyte-scale multifile replication; transient system failure recovery; wide-area networks; Computer networks; Computerized monitoring; Condition monitoring; Data analysis; Energy resolution; Laboratories; Robots; Robustness; Storage automation; Supercomputers;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Scientific and Statistical Database Management, 2004. Proceedings. 16th International Conference on
ISSN :
1099-3371
Print_ISBN :
0-7695-2146-0
Type :
conf
DOI :
10.1109/SSDM.2004.1311236
Filename :
1311236