Title :
Maximizing Hardware Prefetch Effectiveness with Machine Learning
Author :
Saami Rahman;Martin Burtscher;Ziliang Zong;Apan Qasem
Author_Institution :
Dept. of Comput. Sci., Texas State Univ., San Marcos, TX, USA
Abstract :
Modern processors are equipped with multiple hardware prefetchers, each of which targets a distinct level in the memory hierarchy and employs a separate prefetching algorithm. However, different programs require different subsets of these prefetchers to maximize their performance. Turning on all available prefetchers rarely yields the best performance and, in some cases, prefetching even hurts performance. This paper studies the effect of hardware prefetching on multithreaded code and presents a machine-learning technique to predict the optimal combination of prefetchers for a given application. This technique is based on program characterization and utilizes hardware performance events in conjunction with a pruning algorithm to obtain a concise and expressive feature set. The resulting feature set is used in three different learning models. All necessary steps are implemented in a framework that reaches, on average, 96% of the best possible prefetcher speedup. The framework is built from open-source tools, making it easy to extend and port to other architectures.
Keywords :
"Prefetching","Hardware","Algorithms","Optimization","Testing","Training"
Conference_Titel :
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conferen on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on
DOI :
10.1109/HPCC-CSS-ICESS.2015.175