DocumentCode :
2955520
Title :
Real time monitoring system for applications performing the population of condition databases for CMS non-event data
Author :
Cavallari, Francesca ; De Gruttola, Michele ; Di Guida, Salvatore ; Govi, Giacomo ; Innocente, Vincenzo ; Mazrimas, Gediminas ; Paolucci, Pierluigi ; Pierro, Antonio
Author_Institution :
CERN, Geneva, Switzerland
fYear :
2010
fDate :
24-28 May 2010
Firstpage :
1
Lastpage :
8
Abstract :
In real-time systems such as the CMS Online Condition Database, monitoring and rapid error detection are very challenging tasks. Recovering the system and putting it in a safe state requires spotting a faulty situation under strict timing constraints and reacting quickly. In the context of real-time monitoring, this means that the interval between the occurrence of a faulty event and its detection must be minimized. In the CMS experiment, many users with different needs have access to condition data, and they use several applications with different roles. Therefore, the system that monitors the online condition database must describe the status of the infrastructure according to the different categories of users. For example, when no errors occur, it provides simple timing information or the history of all transactions towards all the database accounts; in case of faulty situations, it instead returns simple error messages or more complete debugging information. Hence, to correctly classify an error once it is observed, the monitoring system must describe both the timing aspects of the applications that populate the Online Condition Database schemas and the complex relationship between the components of the heterogeneous software environment. In the first part of this paper, we define the expected behaviour of the handling, storage and retrieval of condition data for the CMS experiment. In the second part, we describe the software components used to detect system failures. Finally, we present the monitoring system used to visualize the status of the condition data infrastructure and ultimately to spot all the possible combinations of error states, and we show how these views can be customized for the different categories of users.
Keywords :
computer debugging; database management systems; information retrieval; monitoring; real-time systems; CMS non-event data; CMS online condition database; debugging information; real time monitoring system; software components; Calibration; Databases; Detectors; History; Monitoring; Servers; Software;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Real Time Conference (RT), 2010 17th IEEE-NPSS
Conference_Location :
Lisbon
Print_ISBN :
978-1-4244-7108-9
Type :
conf
DOI :
10.1109/RTC.2010.5750460
Filename :
5750460