DocumentCode :
3686935
Title :
Monitoring Data Streams at Process Level in Scientific Big Data Batch Clusters
Author :
Eileen Kuehn;Max Fischer;Christopher Jung;Andreas Petzold;Achim Streit
Author_Institution :
Karlsruhe Inst. of Technol., Karlsruhe, Germany
fYear :
2014
Firstpage :
90
Lastpage :
95
Abstract :
The operation of scientific big data centres requires overall monitoring and perception of system components. Insights into internal and external network traffic are of high importance for understanding specific data flows regarding storage accesses, firewall configurations, and the scheduling of batch jobs on clusters for computing/analysis of data. However, the wide adoption of federated storage, the handling of numerous jobs on many-core nodes, and the execution of job pilots inside the batch system complicate current data stream monitoring attempts. Therefore, the rising complexity requires new approaches to extend available solutions. As existing batch system monitoring and related system monitoring tools do not support measurements at batch job level, a new tool has been developed and put into operation at the GridKa data and computing centre at KIT for monitoring continuous data streams. The results obtained can, for example, be used to optimise LAN/WAN setups based on measured data flows so that they adapt to actual needs. This paper describes the current approach being implemented at the GridKa batch cluster and presents first analysis results showing the significance of the measurements. The described approach is consecutively applied to the context of computing for high-energy physics.
Keywords :
"Monitoring","Data mining","Data models","Sockets","Large Hadron Collider","IP networks","Big data"
Publisher :
IEEE
Conference_Titel :
Big Data Computing (BDC), 2014 IEEE/ACM International Symposium on
Type :
conf
DOI :
10.1109/BDC.2014.21
Filename :
7321733