Title :
Automated job monitoring in a high performance computing environment
Author :
Cromp, Robert F. ; Suberri, Gilad
Author_Institution :
Raytheon Co., Waltham, MA, USA
Abstract :
We are developing software that monitors high performance computing assets while users' batch jobs execute and actively performs site-established corrective actions for routine system and queueing issues that Unix administrators normally handle. The automated job monitor is independent of both platform and queueing system, and is customizable for numerous domains.
Keywords :
graphical user interfaces; monitoring; programming environments; queueing theory; software engineering; Unix administrators; automated job monitoring; high performance computing environment; queueing system; routine system; site-established corrective actions; software development; Computational efficiency; Computerized monitoring; Engines; Frequency; High performance computing; Processor scheduling; Production systems; Resource management; Retirement; Software performance
Conference_Title :
Autonomic Computing, 2004. Proceedings. International Conference on
Print_ISBN :
0-7695-2114-2
DOI :
10.1109/ICAC.2004.1301384