Title :
Energy-effectiveness of pre-execution and energy-aware p-thread selection
Author :
Petric, Vlad ; Roth, Amir
Author_Institution :
Dept. of Comput. & Inf. Sci., Pennsylvania Univ., Philadelphia, PA, USA
Abstract :
Pre-execution removes the microarchitectural latency of "problem" loads from a program's critical path by redundantly executing copies of their computations in parallel with the main program. Prior work includes several proposed pre-execution systems, a quantitative framework (PTHSEL) for analytical pre-execution thread (p-thread) selection, and even a research prototype. To date, however, the energy aspects of pre-execution have not been studied. Cycle-level performance and energy simulations on SPEC2000 integer benchmarks that suffer from L2 misses show that energy-blind pre-execution naturally has a linear latency/energy trade-off, improving performance by 13.8% while increasing energy consumption by 11.9%. To improve this trade-off, we propose two extensions to PTHSEL. First, we replace the flat cycle-for-cycle load cost model with a model based on critical-path estimation. This extension increases p-thread efficiency in an energy-independent way. Second, we add a parameterized energy model to PTHSEL (forming PTHSEL+E) that allows it to actively select p-threads that reduce energy rather than (or in combination with) execution latency. Experiments show that PTHSEL+E manipulates pre-execution's latency/energy trade-off more effectively. Latency-targeted selection benefits from the improved load cost model: its performance improvements grow to an average of 16.4% while energy costs drop to 8.7%. Energy-delay (ED) targeted selection produces p-threads that improve performance by only 12.9% but improve ED by 8.8%. Targeting p-thread selection for energy reduction results in "energy-free" pre-execution, with an average speedup of 5.4% and a small decrease in total energy consumption (0.7%).
Keywords :
microprogramming; performance evaluation; SPEC2000 integer benchmarks; critical-path estimation; energy consumption; energy trade-off; energy-aware p-thread selection; energy-blind pre-execution; execution latency; flat cycle-for-cycle load cost model; latency targeted selection; linear latency; microarchitectural latency; parameterized energy model; pre-execution p-thread selection; Computer aided instruction; Computer architecture; Delay; Energy consumption; Hardware; Information science; Multithreading; Pareto optimization; Prefetching; Yarn;
Conference_Title :
Computer Architecture, 2005. ISCA '05. Proceedings. 32nd International Symposium on
Print_ISBN :
0-7695-2270-X
DOI :
10.1109/ISCA.2005.27