Title :
Integration Experiences and Performance Studies of A COTS Parallel Archive System
Author :
Chen, Hsing-bung ; Grider, Gary ; Scott, Cody ; Turley, Milton ; Torres, Aaron ; Sanchez, Kathy ; Bremer, John
Author_Institution :
Los Alamos Nat. Lab., Los Alamos, NM, USA
Abstract :
Present and future archive storage systems are challenged to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management, (d) scale to meet the changing needs of very large data sets, (e) support standard interfaces, and (f) use commercial-off-the-shelf (COTS) hardware. Parallel file systems are expected to meet the same demands, but with performance one or more orders of magnitude higher. Archive systems continue to grow more comparable to file systems in their design, driven by the need for speed and bandwidth, especially metadata search speed, with changes such as more caching and less robust semantics. Currently, the number of highly scalable parallel archive solutions is very limited, especially for moving a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach combining COTS components with innovative software technology can bring new capabilities into a production environment for the HPC community. This approach is much faster than creating and maintaining a complete, unique end-to-end parallel archive software solution. We relate our experience of integrating a global parallel file system and a standard backup/archive product with innovative parallel software to construct a scalable, parallel archive storage system. Our solution has a high degree of overlap with current parallel archive products, including (a) parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high-volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc.
We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner machine, and demonstrated its capability to address requirements of future archival storage systems. Now this new Parallel Archive System is used on the LANL's Turquoise Network.
Keywords :
meta data; parallel processing; software packages; storage management; very large databases; COTS components; COTS hardware; COTS parallel archive system; HPC community; LANL roadrunner machine; LANL turquoise network; archive storage systems; commercial-off-the-shelf hardware; hybrid storage approach; innovative parallel software code; innovative software technology; large striped parallel disk file; metadata performance; metadata searching speeds; parallel archive products; parallel archive software solution; parallel file systems; production environment; support policy-based hierarchical storage management capability; support standard interface; very large data sets; Bandwidth; File systems; Media; Runtime environment; Software systems; Archive Storage System; Cluster Computing; Hierarchical Storage Management; Parallel Archive; Parallel Data Movement; Parallel File System; Parallel I/O; Storage Hierarchy;
Conference_Title :
Cluster Computing (CLUSTER), 2010 IEEE International Conference on
Conference_Location :
Heraklion, Crete
Print_ISBN :
978-1-4244-8373-0
Electronic_ISBN :
978-0-7695-4220-1
DOI :
10.1109/CLUSTER.2010.23