DocumentCode :
3325484
Title :
Use of late-binding technology for workload management system in CMS
Author :
Padhi, Sanjay ; Pi, Haifeng ; Sfiligoi, Igor ; Wuerthwein, Frank
Author_Institution :
Univ. of California, San Diego, CA, USA
fYear :
2009
fDate :
Oct. 24 2009-Nov. 1 2009
Firstpage :
512
Lastpage :
515
Abstract :
A Condor glidein-based workload management system (glideinWMS) has been developed and integrated with the distributed physics analysis and Monte Carlo (MC) production systems of the Compact Muon Solenoid (CMS) experiment. Late binding between jobs and the computing element (CE), together with validation of the WorkerNode (WN) environment, helps significantly reduce the failure rate of Grid jobs. For CPU-consuming MC data production, opportunistic Grid resources can be effectively exploited via the extended computing pool built on top of various heterogeneous Grid resources. The Virtual Organization (VO) policy is embedded into the glideinWMS and pilot job configuration. GSI authentication and authorization, and interfacing with gLExec, allow a large user base to be supported and seamlessly integrated with the Grid computing infrastructure. The operation of glideinWMS at CMS shows that it is a highly available and stable system for a large VO of thousands of users, running tens of thousands of user jobs simultaneously. The enhanced monitoring allows system administrators and users to easily track system-level and job-level status.
Keywords :
Monte Carlo methods; grid computing; high energy physics instrumentation computing; virtual enterprises; CPU-consuming MC data production; Condor glidein-based workload management; GSI authentication; Grid computing infrastructure; Monte Carlo production system; compact muon solenoid experiment; computing element; computing pool; distributed physics analysis; failure rate; gLExec; glidein-based workload management system; glideinWMS; grid jobs; heterogeneous Grid resources; late-binding technology; opportunistic Grid resources; pilot job configuration; virtual organization policy; workernode environment; workload management system; Authentication; Authorization; Collision mitigation; Grid computing; Mesons; Monte Carlo methods; Physics; Production systems; Solenoids; Technology management;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Nuclear Science Symposium Conference Record (NSS/MIC), 2009 IEEE
Conference_Location :
Orlando, FL
ISSN :
1095-7863
Print_ISBN :
978-1-4244-3961-4
Electronic_ISBN :
1095-7863
Type :
conf
DOI :
10.1109/NSSMIC.2009.5401636
Filename :
5401636