Title :
Reproducing non-deterministic bugs with lightweight recording in production environments
Author :
Wang, Nan ; Han, Jizhong ; Fu, Haiping ; He, Xubin ; Fang, Jinyun
Author_Institution :
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
Abstract :
Reproducing non-deterministic bugs is challenging. Recording program execution in production environments and then replaying it to reproduce bugs is an effective way to re-enable cyclic debugging. Unfortunately, most current record-replay approaches introduce large perturbations to the environment, the execution flow, or both, in addition to a performance penalty and high storage overhead, which makes them impractical to deploy in production environments. This paper presents Snitchaser, a fully user-space record-replay tool that faithfully reproduces bugs by replaying system calls recorded with negligible perturbation and recording overhead. This is achieved by 1) a novel, lightweight system call interception mechanism that does not patch binary instructions, reducing the perturbation to execution flow; 2) a system call latch to preserve signal semantics; 3) periodic checkpointing to reduce the storage overhead. Snitchaser focuses on bugs caused by asynchronous events on heavily loaded, high-throughput servers. Experimental results show that Snitchaser is capable of reproducing non-deterministic bugs efficiently at nearly no performance penalty. We also present two case studies on dealing with existing bugs in Lighttpd, a popular web server used in many large-scale systems.
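The general idea of replaying recorded system call results from a periodic checkpoint can be illustrated with a toy sketch. This is not Snitchaser's implementation: the abstract describes interception at the real system call level, while the sketch below only models a "system call" as a Python callable. The names `run`, `Recorder`, `replay`, and the `checkpoint_every` parameter are all hypothetical and chosen for illustration.

```python
import copy
import random


def run(state, n_steps, syscall):
    """Toy program: all of its nondeterminism enters through syscall()."""
    for _ in range(n_steps):
        value = syscall("read")
        state["total"] += value
        state["steps"] += 1
    return state


class Recorder:
    """Record mode (hypothetical): perform the real call and log its result."""

    def __init__(self, source, state, checkpoint_every):
        self.source = source                    # stand-in for the real system call
        self.state = state                      # live program state being observed
        self.checkpoint_every = checkpoint_every
        self.checkpoint = copy.deepcopy(state)  # initial checkpoint
        self.log = []                           # results since the last checkpoint

    def __call__(self, name):
        # Periodic checkpoint: snapshot the state and discard the older log,
        # so storage is bounded by checkpoint_every entries.
        if len(self.log) >= self.checkpoint_every:
            self.checkpoint = copy.deepcopy(self.state)
            self.log.clear()
        result = self.source()
        self.log.append((name, result))
        return result


def replay(checkpoint, log):
    """Replay mode: restart from the checkpoint and feed back logged results."""
    pending = iter(log)

    def replayed_syscall(name):
        recorded_name, result = next(pending)
        assert recorded_name == name, "replay diverged from the recording"
        return result

    return run(copy.deepcopy(checkpoint), len(log), replayed_syscall)


# Record ten steps whose inputs are random, then reproduce the final state.
live = {"total": 0, "steps": 0}
rec = Recorder(lambda: random.randint(0, 255), live, checkpoint_every=4)
run(live, 10, rec)
replayed = replay(rec.checkpoint, rec.log)
```

In this sketch, replay starts from the most recent checkpoint rather than from program start, so only the log entries recorded after that checkpoint need to be kept. That is the storage trade-off the abstract attributes to periodic checkpointing.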
Keywords :
checkpointing; product development; program debugging; Lighttpd; Snitchaser; cyclic debugging; execution flow; lightweight recording; lightweight system call interception mechanism; nondeterministic bug reproduction; performance penalty; periodic checkpointing; production environments; program execution recording; system call latch; user-space record-replay tool; Checkpointing; Computer bugs; Debugging; Hardware; Production; Semantics; Throughput;
Conference_Title :
Performance Computing and Communications Conference (IPCCC), 2010 IEEE 29th International
Conference_Location :
Albuquerque, NM
Print_ISBN :
978-1-4244-9330-2
DOI :
10.1109/PCCC.2010.5682332