• DocumentCode
    934384
  • Title
    A Rule-Based Verification and Control Framework in ATLAS Trigger-DAQ
  • Author
    Kazarov, A. ; Corso-Radu, A. ; Miotto, G. Lehmann ; Sloper, J.E. ; Ryabov, Yu
  • Author_Institution
    CERN, Geneva
  • Volume
    54
  • Issue
    3
  • fYear
    2007
  • fDate
    6/1/2007 12:00:00 AM
  • Firstpage
    604
  • Lastpage
    608
  • Abstract
    To meet the data-taking requirements of the ATLAS experiment, the Trigger-DAQ (TDAQ) system is composed of O(10000) applications running on more than 2600 networked computers. At this scale, software and hardware failures are frequent. To minimize system downtime, the Trigger-DAQ control system shall include advanced verification and diagnostics facilities. The operator shall use the tests and expertise of the TDAQ and detector developers to diagnose and recover from errors, automatically if possible. The TDAQ control system is built as a distributed tree of controllers, where the behavior of each controller is defined in a rule-based language that allows easy customization. The control system also includes a verification framework which allows users to develop and configure tests of varying complexity for any component in the system. It can be used as a stand-alone test facility for a small detector installation, as part of the general TDAQ initialization procedure, and for diagnosing problems that occur at run time. The system is currently being used in TDAQ commissioning at the ATLAS experimental zone and by subdetectors for stand-alone verification of the detector hardware before final installation.
  • Keywords
    computational complexity; control engineering computing; distributed processing; high energy physics instrumentation computing; knowledge based systems; system recovery; ATLAS Trigger-DAQ; ATLAS experiment data taking; complexity; computer network; control framework; control system; detector hardware; diagnostics facilities; distributed controller tree; error recovery; hardware failures; rule-based verification; software failures; stand-alone test facility; system downtime; system size; Application software; Automatic control; Automatic testing; Computer networks; Control systems; Detectors; Distributed control; Hardware; Software systems; System testing; Artificial intelligence; command and control systems; control systems; data acquisition; diagnostic expert systems; distributed computing; distributed control; distributed information systems; expert system shells; expert systems; intelligent control; large-scale systems; process control; programmable control; system analysis and design;
  • fLanguage
    English
  • Journal_Title
    Nuclear Science, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9499
  • Type
    jour
  • DOI
    10.1109/TNS.2007.897825
  • Filename
    4237418