DocumentCode
3693652
Title
dsReliM: Power-constrained reliability management in Dark-Silicon many-core chips under process variations
Author
Mohammad Salehi;Muhammad Shafique;Florian Kriebel;Semeen Rehman;Mohammad Khavari Tavana;Alireza Ejlali;Jörg Henkel
Author_Institution
ESRLab, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
fYear
2015
Firstpage
75
Lastpage
82
Abstract
Due to the tight power envelope, it is envisaged that in future technology nodes not all cores in a many-core chip can be powered on simultaneously (at full performance level). The power-gated cores are referred to as Dark Silicon. At the same time, growing reliability issues due to process variations and soft errors challenge the cost-effective deployment of future technology nodes. This paper presents a reliability management system for Dark Silicon chips (dsReliM) that optimizes the reliability of on-chip systems while jointly accounting for soft errors, process variations and the thermal design power (TDP) constraint. Toward TDP-constrained reliability optimization, dsReliM leverages multiple reliable application versions that can potentially execute on different cores that exhibit frequency variations and support different voltage-frequency levels, thus facilitating distinct power, reliability and performance tradeoffs at run time. Experiments show that our dsReliM system provides up to 20% reliability improvements under different TDP constraints when compared to a state-of-the-art technique. Also, compared to an ideal-case optimal solution, dsReliM deviates by up to 2.5% in terms of reliability efficiency, but speeds up the reliability management decision time by a factor of up to 3100.
Keywords
"Software reliability","Software","Silicon","Timing","Management","Transient analysis"
Publisher
ieee
Conference_Titel
Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2015 International Conference on
Type
conf
DOI
10.1109/CODESISSS.2015.7331370
Filename
7331370
Link To Document