DocumentCode :
244808
Title :
Development of a farm-oriented benchmark tool for distributed filesystem
Author :
Favaro, Matteo ; Ricci, Pier Paolo ; Gregori, Daniele
Author_Institution :
CNAF, INFN, Bologna, Italy
fYear :
2014
fDate :
21-25 July 2014
Firstpage :
1023
Lastpage :
1029
Abstract :
The INFN CNAF Tier1 is the main computing center of the Italian Institute of Nuclear Physics. Several teams cooperate there to ensure the proper operation of the entire center: the farming, storage, network and infrastructure departments. For each team, one of the most important needs is to measure the performance of the hardware in use. However, defining which performance measurements are significant in a given context, and how to measure them, is not an obvious task. Our work focuses on storage devices, and the goal of this project is to understand how to measure the performance of a filesystem under a real production data access pattern, in order to compare different hardware solutions for the production environment. Other tools currently exist that can measure the throughput of a filesystem, in terms of bandwidth or input/output operations per second (IOPS), but they cannot simulate a real production data access pattern. In practice, these tools work only with a limited number of concurrent processes, or they are not flexible enough to simulate a real pattern. Furthermore, they must be synchronized via barriers (i.e. a software synchronization method) or via an ad hoc library such as MPI. The tool presented in this paper can simulate the data access pattern of a real physics analysis job, starting from a real job, and replicate the simulation without heavy synchronization between nodes and without a heavy environment setup. The data are collected from each node and stored losslessly in a remote database. The tool also accepts tuning parameters to suit different scenarios; e.g. it is possible to choose the sampling times during the test, or the duration a test may take so that it fits a scheduled time window arranged with other teams.
It is also possible to represent the data graphically, using an appropriate analysis and visualization program. In addition, the data can be processed in various ways, and the sampling times can be aggregated for better visualization. The tool was initially developed to measure GPFS and NFS filesystems, but it has a modular implementation and can therefore be extended to other types of filesystem simply by developing the corresponding module for each.
Keywords :
distributed databases; program visualisation; storage management; GPFS filesystem; INFN CNAF Tier1; Italian Institute of Nuclear Physics; MPI; NFS filesystem; ad hoc library; computing center; distributed filesystem; farm-oriented benchmark tool development; farming departments; hardware performance measurements; hardware solution; infrastructure departments; input/output operation; network departments; production environment; real physical analysis job data access pattern; real production data access pattern; software synchronization method; storage departments; storage devices; throughput measurement; tuning parameters; visualization program; Bandwidth; Benchmark testing; Data models; Databases; Hardware; Monitoring; Synchronization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing & Simulation (HPCS), 2014 International Conference on
Conference_Location :
Bologna
Print_ISBN :
978-1-4799-5312-7
Type :
conf
DOI :
10.1109/HPCSim.2014.6903806
Filename :
6903806