Title :
On performance debugging of unnecessary lock contentions on multicore processors: A replay-based approach
Author :
Long Zheng ; Xiaofei Liao ; Bingsheng He ; Song Wu ; Hai Jin
Author_Institution :
Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China
Abstract :
Locks have been widely used as an effective synchronization mechanism among processes and threads. However, we observe that a large number of false inter-thread dependencies (i.e., unnecessary lock contentions) exist during program execution on multicore processors, thereby incurring significant performance overhead. This paper presents a performance debugging framework, PerfPlay, to facilitate a comprehensive and in-depth understanding of the performance impact of unnecessary lock contentions. The core technique of our debugging framework is trace replay. Specifically, PerfPlay records the program execution trace, on the basis of which the unnecessary lock contentions can be identified through trace analysis. We then propose a novel trace transformation technique that rewrites the identified unnecessary lock contentions in the original trace into the correct pattern, yielding a new trace free of unnecessary lock contentions. By replaying both traces, PerfPlay can quantify the performance impact of unnecessary lock contentions. To demonstrate the effectiveness of our debugging framework, we study five real-world programs and PARSEC benchmarks. Our experimental results demonstrate the significant performance overhead of unnecessary lock contentions, and the effectiveness of PerfPlay in identifying the performance-critical unnecessary lock contentions in real applications.
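The trace-analysis step described in the abstract can be illustrated with a minimal sketch. It is not PerfPlay's implementation: the `CriticalSection` record, the `conflicts` test, and the criterion itself (two critical sections on the same lock, from different threads, whose memory accesses do not conflict) are assumptions made for illustration, since the abstract does not spell out how a contention is judged unnecessary. The sketch also compares every pair of sections on a lock, rather than only those that actually overlapped in time in a recorded trace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalSection:
    """Hypothetical trace record: one lock-protected region executed by a thread."""
    thread: int
    lock: str
    reads: frozenset = frozenset()   # addresses read inside the region
    writes: frozenset = frozenset()  # addresses written inside the region


def conflicts(a: CriticalSection, b: CriticalSection) -> bool:
    """True if the two regions have a write-write, write-read, or read-write overlap."""
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


def unnecessary_contentions(trace):
    """Return pairs that share a lock across threads but touch no conflicting data.

    Assumed criterion: such pairs serialize on the lock without a true data
    dependency, so they are candidates for a false inter-thread dependency.
    """
    pairs = []
    for i, a in enumerate(trace):
        for b in trace[i + 1:]:
            if a.lock == b.lock and a.thread != b.thread and not conflicts(a, b):
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    trace = [
        CriticalSection(thread=1, lock="m", reads=frozenset({"x"})),
        CriticalSection(thread=2, lock="m", reads=frozenset({"x"})),
        CriticalSection(thread=3, lock="m", writes=frozenset({"x"})),
    ]
    for a, b in unnecessary_contentions(trace):
        print(f"threads {a.thread} and {b.thread} contend on {a.lock!r} without a data conflict")
```

In this toy trace only the two read-only sections are flagged, because the writer in thread 3 truly depends on the lock. A replay-based tool would go further and re-execute both the original and the transformed trace to measure the time difference, which this sketch does not attempt.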
Keywords :
multiprocessing systems; program debugging; program diagnostics; synchronisation; PARSEC benchmarks; PerfPlay; interthread dependencies; multicore processors; performance debugging; program execution trace; replay-based approach; synchronization mechanism; trace analysis; trace transformation; unnecessary lock contentions; Debugging; Educational institutions; Instruction sets; Multicore processing; Semantics; Synchronization; Topology;
Conference_Title :
Code Generation and Optimization (CGO), 2015 IEEE/ACM International Symposium on
Conference_Location :
San Francisco, CA
DOI :
10.1109/CGO.2015.7054187