Title :
Improved thermal management with reliability banking
Author :
Lu, Zhijian ; Lach, John ; Stan, Mircea R. ; Skadron, Kevin
Author_Institution :
Charles L. Brown Dept. of Electr. & Comput. Eng., Virginia Univ., Charlottesville, VA, USA
Abstract :
Using a fixed temperature threshold for thermal throttling is pessimistic. Reduced aging during periods of low temperature can compensate for accelerated aging during periods of high temperature, so runtime tracking of the temperature-dependent aging rate allows throttling to be engaged only when necessary to maintain reliability. In this article, we show that the effect of cool (low-temperature) phases on reliability can compensate for that of hot (high-temperature) phases. Existing dynamic thermal management (DTM) techniques ignore the effects of temperature fluctuations on chip lifetime and can impose unnecessary performance penalties during hot phases. Using electromigration as the targeted failure mechanism, we apply a dynamic reliability model and propose a dynamic reliability management (DRM) technique that tracks the consumption of chip lifetime during operation.
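Illustrative sketch (not taken from the article): the banking idea in the abstract can be approximated with a simple Arrhenius-style aging rate, as in the temperature term of Black's equation for electromigration. The class name ReliabilityBank, the 0.9 eV activation energy, the reference temperature, and the assumption of constant current density are all hypothetical choices made for illustration; the article's actual dynamic reliability model may differ.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def relative_aging_rate(temp_k, ref_temp_k, activation_energy_ev=0.9):
    """Arrhenius aging rate at temp_k, normalized to 1.0 at ref_temp_k.

    Assumption: only the temperature term of Black's equation is modeled;
    current density is treated as constant. The 0.9 eV default is an
    illustrative value, not one taken from the article.
    """
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / ref_temp_k - 1.0 / temp_k
    )
    return math.exp(exponent)


class ReliabilityBank:
    """Hypothetical tracker of lifetime consumed versus lifetime budgeted."""

    def __init__(self, ref_temp_k, activation_energy_ev=0.9):
        self.ref_temp_k = ref_temp_k
        self.activation_energy_ev = activation_energy_ev
        self.elapsed = 0.0   # wall-clock time, i.e. budget at the nominal rate
        self.consumed = 0.0  # equivalent time aged at the reference temperature

    def record(self, temp_k, dt):
        """Account for dt time units spent at temperature temp_k."""
        rate = relative_aging_rate(
            temp_k, self.ref_temp_k, self.activation_energy_ev
        )
        self.elapsed += dt
        self.consumed += rate * dt

    @property
    def balance(self):
        """Positive when cool phases have banked unused lifetime."""
        return self.elapsed - self.consumed

    def should_throttle(self, temp_k):
        """Throttle only when running hot with no banked lifetime left."""
        return temp_k > self.ref_temp_k and self.balance <= 0.0


if __name__ == "__main__":
    bank = ReliabilityBank(ref_temp_k=355.0)
    bank.record(340.0, dt=10.0)  # cool phase deposits into the bank
    print(bank.balance > 0.0, bank.should_throttle(365.0))
    bank.record(365.0, dt=15.0)  # long hot phase spends the deposit
    print(bank.balance > 0.0, bank.should_throttle(365.0))
```

In this sketch a fixed-threshold policy would throttle as soon as the temperature exceeds the reference value, whereas the bank defers throttling until the lifetime saved during cool phases has been spent.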
Keywords :
electromigration; integrated circuit design; integrated circuit reliability; microprocessor chips; chip lifetime; dynamic reliability management; dynamic thermal management; reliability banking; temperature fluctuations; temperature-dependent aging rate; Accelerated aging; Banking; Failure analysis; Fluctuations; Maintenance; Runtime; Target tracking; Temperature; Thermal management; Analytical and simulation techniques; Dynamic thermal/reliability management; Modeling; Performability
Journal_Title :
Micro, IEEE
DOI :
10.1109/MM.2005.114