DocumentCode :
3385273
Title :
Toward understanding soft faults in high performance cluster networks
Author :
Evans, Jeffrey J. ; Baik, Seongbok ; Hood, Cynthia S. ; Gropp, William
Author_Institution :
Dept. of Comput. Sci., Illinois Inst. of Technol., Chicago, IL, USA
fYear :
2003
fDate :
24-28 March 2003
Firstpage :
117
Lastpage :
120
Abstract :
Fault management in high performance cluster networks has been focused on the notion of hard faults (i.e., link or node failures). Network degradations that negatively impact performance but do not result in failures often go unnoticed. In this paper, we classify such degradations as soft faults. In addition, we identify consistent performance as an important requirement in cluster networks. Using this service requirement, we describe a comprehensive strategy for cluster fault management.
Keywords :
computer network management; performance evaluation; workstation clusters; cluster fault management; consistent performance; high performance cluster networks; network degradations; soft faults; Computer network management; Computer science; Degradation; Environmental management; Grid computing; Intelligent networks; Kernel; Laboratories; Processor scheduling; Technology management
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Integrated Network Management, 2003. IFIP/IEEE Eighth International Symposium on
Print_ISBN :
1-4020-7418-2
Type :
conf
DOI :
10.1109/INM.2003.1194169
Filename :
1194169