Title :
Instruction Cache Locking Using Temporal Reuse Profile
Author :
Yun Liang ; Mitra, Tulika ; Lei Ju
Author_Institution :
Center for Energy-Efficient Comput. & Applic., Peking Univ., Beijing, China
Abstract :
The performance of most embedded systems is critically dependent on the average memory access latency. Improving the cache hit rate can have a significant positive impact on the performance of an application. Modern embedded processors often feature cache locking mechanisms that allow memory blocks to be locked in the cache under software control. Cache locking was primarily designed to offer timing predictability for hard real-time applications. Hence, prior techniques focus on employing cache locking to improve the worst-case execution time. However, cache locking can also be quite effective in improving the average-case execution time of general embedded applications. In this paper, we explore static instruction cache locking to improve the average-case program performance. We introduce the temporal reuse profile (TRP) to accurately and efficiently model the cost and benefit of locking memory blocks in the cache. We consider two locking mechanisms: line locking and way locking. For each locking mechanism, we propose a branch-and-bound algorithm and a heuristic approach that use the TRP to determine the most beneficial memory blocks to lock in the cache. Experimental results show that the heuristic approach achieves results close to those of the branch-and-bound algorithm and can improve performance by 12% on average for a 4 KB cache across a suite of real-world benchmarks. Moreover, our heuristic provides significant improvement over the state-of-the-art locking algorithm in terms of both performance and efficiency.
Keywords :
cache storage; embedded systems; tree searching; Instruction Cache Locking; TRP; average-case program performance; branch-and-bound algorithm; embedded system; heuristic approach; memory access latency; modern embedded processor; software control; static instruction cache locking; temporal reuse profile; Algorithm design and analysis; Heuristic algorithms; Optimization; Program processors; Real-time systems; Timing; Cache; Cache Locking; Performance; temporal reuse profile (TRP)
Journal_Title :
Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on
DOI :
10.1109/TCAD.2015.2418320