• DocumentCode
    3565705
  • Title

    Failure Analysis in a highly parallel processor for L1 Triggering

  • Author

    Cancelo, G. ; Gottschalk, E. ; Pavlicek, V. ; Wang, M. ; Wu, J.

  • Author_Institution
    Fermi Nat. Accel. Lab., Batavia, IL, USA
  • Volume
    2
  • fYear
    2003
  • Firstpage
    1283
  • Abstract
    The current paper studies how processor failures affect the dataflow of the Level 1 Trigger in the BTeV experiment proposed to run at Fermilab's Tevatron. The failure analysis is crucial for a system with over 2500 processing nodes and a number of storage units and communication links of the same order of magnitude. The failure analysis is based on models of the L1 Trigger architecture and shows the dynamics of the architecture's dataflow. The failure analysis provides insight into how system variables are affected by single component failures and provides key information for the implementation of error recovery strategies. The analysis includes both short-term failures from which the system can recover quickly and long-term failures which imply a more drastic error recovery strategy. The modeling results are supported by behavioral simulations of the L1 Trigger processing BTeV's Geant Monte Carlo data.
  • Keywords
    failure analysis; nuclear electronics; parallel processing; trigger circuits; BTeV experiment; Geant Monte Carlo data; L1 Triggering; failure analysis; highly parallel processor; Data analysis; Detectors; Failure analysis; Fault tolerant systems; Hardware; Laboratories; Mesons; Queueing analysis; Steady-state; Switches
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Nuclear Science Symposium Conference Record, 2003 IEEE
  • ISSN
    1082-3654
  • Print_ISBN
    0-7803-8257-9
  • Type

    conf

  • DOI
    10.1109/NSSMIC.2003.1351928
  • Filename
    1351928