Title :
Automatic Locality Exploitation in the Codelet Model
Author :
Chen Chen ; Yao Wu ; Joshua Suetterlein ; Long Zheng ; Minyi Guo ; Guang R. Gao
Author_Institution :
Univ. of Delaware, Newark, DE, USA
Abstract :
State-of-the-art codelet scheduling focuses on dynamically balancing the workload of codelets (which are similar to tasks). While this approach may achieve reasonable performance because computation resources are fully utilized, it may not attain optimal energy savings. In this paper, targeting the IBM Cyclops64 manycore system, we propose a novel polynomial-time algorithm that finds the optimal codelet schedule in terms of maximum locality and minimum global memory accesses. Our algorithm leverages static information about locality among codelets to achieve better performance and energy efficiency. By using local buffers to pass data produced by one codelet to another, global memory accesses can be greatly reduced. Experimental results on our IBM Cyclops-64 emulator show that, compared with state-of-the-art codelet scheduling, the schedule produced by our algorithm removes up to 59.7% of global memory accesses, improves performance by up to 68.1%, and reduces energy consumption by up to 40.7%.
Keywords :
computational complexity; multiprocessing systems; scheduling; IBM Cyclops-64 emulator; IBM Cyclops64 manycore system; automatic locality exploitation; codelet dynamic workload balance; codelet scheduling model; energy consumption; energy efficiency; global memory access; polynomial time algorithm; static information; Computer architecture; Heuristic algorithms; Optimal scheduling; Partitioning algorithms; Processor scheduling; Schedules; Scheduling; codelet; execution model; fine-grain; locality;
Conference_Title :
Trust, Security and Privacy in Computing and Communications (TrustCom), 2013 12th IEEE International Conference on
Conference_Location :
Melbourne, VIC
DOI :
10.1109/TrustCom.2013.104