DocumentCode
3565705
Title
Failure Analysis in a highly parallel processor for L1 Triggering
Author
Cancelo, G. ; Gottschalk, E. ; Pavlicek, V. ; Wang, M. ; Wu, J.
Author_Institution
Fermi Nat. Accel. Lab., Batavia, IL, USA
Volume
2
fYear
2003
Firstpage
1283
Abstract
This paper studies how processor failures affect the dataflow of the Level 1 Trigger in the BTeV experiment proposed to run at Fermilab's Tevatron. The failure analysis is crucial for a system with over 2500 processing nodes and a number of storage units and communication links of the same order of magnitude. The failure analysis is based on models of the L1 Trigger architecture and shows the dynamics of the architecture's dataflow. The failure analysis provides insight into how system variables are affected by single-component failures and provides key information for the implementation of error recovery strategies. The analysis includes both short-term failures, from which the system can recover quickly, and long-term failures, which imply a more drastic error recovery strategy. The modeling results are supported by behavioral simulations of the L1 Trigger processing BTeV's Geant Monte Carlo data.
Keywords
failure analysis; nuclear electronics; parallel processing; trigger circuits; BTeV experiment; Geant Monte Carlo data; L1 Triggering; failure analysis; highly parallel processor; Data analysis; Detectors; Failure analysis; Fault tolerant systems; Hardware; Laboratories; Mesons; Queueing analysis; Steady-state; Switches;
fLanguage
English
Publisher
IEEE
Conference_Titel
Nuclear Science Symposium Conference Record, 2003 IEEE
ISSN
1082-3654
Print_ISBN
0-7803-8257-9
Type
conf
DOI
10.1109/NSSMIC.2003.1351928
Filename
1351928