Title :
HybDTM: a coordinated hardware-software approach for dynamic thermal management
Author :
Kumar, Amit ; Shang, Li ; Peh, Li-Shiuan ; Jha, Niraj K.
Author_Institution :
Dept. of Electrical Engineering, Princeton University, NJ
Abstract :
With ever-increasing power density and cooling costs in modern high-performance systems, dynamic thermal management (DTM) has emerged as an effective technique for guaranteeing thermal safety at run-time. While past works on DTM have focused on different techniques in isolation, they fail to consider a synergistic mechanism using both hardware and software support and hence lead to a significant execution time overhead. In this paper, we propose HybDTM, a methodology for fine-grained, coordinated thermal management using a hybrid of hardware techniques, such as clock gating, and software techniques, such as thermal-aware process scheduling, synergistically leveraging the advantages of both approaches. We show that while hardware techniques can be used reactively to manage thermal emergencies, proactive use of low-overhead software techniques can rely on application-specific thermal profiles to lower system temperature. Our technique involves a novel regression-based thermal model which provides fast and accurate temperature estimates for run-time thermal characterization of applications running on the system, using hardware performance counters, while considering system-level thermal issues. We evaluate HybDTM on an actual desktop system running a number of SPEC2000 benchmarks, in both uniprocessor and simultaneous multithreading (SMT) environments, and show that it is able to successfully manage the overall temperature with an average execution time overhead of only 9.9% (16.3% maximum) compared to the case without any DTM, as opposed to 20.4% (29.5% maximum) overhead for purely hardware-based DTM.
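The abstract's two main ideas, a regression-based temperature estimate driven by performance counters and a split between proactive software action and reactive hardware action, can be illustrated with a minimal sketch. The abstract does not give the actual model form, counters, or thresholds, so everything here is an assumption made for illustration: a single predictor (instructions per cycle), invented sample data, and the hypothetical names `fit_line`, `choose_action`, `soft_limit_c`, and `hard_limit_c`.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor.

    Illustrative stand-in only; the paper's regression model is not
    specified in the abstract.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def choose_action(temp_c, soft_limit_c, hard_limit_c):
    """Pick a DTM response for an estimated temperature (hypothetical policy).

    At or above the hard limit: reactive hardware response (clock gating).
    At or above the soft limit: proactive software response (scheduling).
    """
    if temp_c >= hard_limit_c:
        return "clock_gate"
    if temp_c >= soft_limit_c:
        return "schedule_cooler_process"
    return "none"


# Invented calibration samples: instructions per cycle vs. temperature (deg C).
ipc_samples = [0.5, 1.0, 1.5, 2.0]
temp_samples = [50.0, 55.0, 60.0, 65.0]

a, b = fit_line(ipc_samples, temp_samples)

# Estimate temperature for a process whose counters report an IPC of 1.8.
estimate = a + b * 1.8
action = choose_action(estimate, soft_limit_c=60.0, hard_limit_c=70.0)
print(f"estimated {estimate:.1f} C -> {action}")
```

In this sketch the software response triggers at a lower threshold than the hardware response, mirroring the abstract's description of software techniques being used proactively and hardware techniques reactively.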
Keywords :
hardware-software codesign; regression analysis; thermal management (packaging); HybDTM method; SPEC2000 benchmarks; application-specific thermal profiles; coordinated hardware-software approach; coordinated thermal management; dynamic thermal management; hardware performance counters; hardware techniques; low-overhead software techniques; regression-based thermal model; run-time thermal characterization; Cooling; Costs; Disaster management; Energy management; Hardware; Power system management; Runtime; Safety; Temperature; Thermal management; Design; hybrid hardware-software management; performance; thermal model
Conference_Title :
Design Automation Conference, 2006 43rd ACM/IEEE
Conference_Location :
San Francisco, CA
Print_ISBN :
1-59593-381-6
DOI :
10.1109/DAC.2006.229219