Title :
Trace-driven debugging of message passing programs
Author :
Frumkin, Michael ; Hood, Robert ; Lopez, Louis
Author_Institution :
NAS Syst. Div., NASA Ames Res. Center, Moffett Field, CA, USA
Date :
30 Mar-3 Apr 1998
Abstract :
We report on features added to a parallel debugger to simplify the debugging of message passing programs. These features include replay, setting consistent breakpoints based on interprocess event causality, a parallel undo operation, and communication supervision. These features all use trace information collected during the execution of the program being debugged. We used a number of different instrumentation techniques to collect traces. We also implemented trace displays using two different trace visualization systems. The implementation was tested on an SGI Power Challenge cluster and a network of SGI workstations.
Keywords :
local area networks; message passing; parallel programming; program debugging; SGI Power Challenge cluster; SGI workstation network; communication supervision; consistent breakpoints; instrumentation techniques; interprocess event causality; message passing program debugging; parallel undo operation; replay; trace displays; trace driven debugging; trace visualization systems; Debugging; History; Instruments; Libraries; Message passing; Monitoring; NASA; Space technology; Visualization; Workstations
Conference_Title :
Parallel Processing Symposium, 1998. IPPS/SPDP 1998. Proceedings of the First Merged International ... and Symposium on Parallel and Distributed Processing 1998
Conference_Location :
Orlando, FL
Print_ISBN :
0-8186-8404-6
DOI :
10.1109/IPPS.1998.670012