Title :
A highly efficient, thread-safe software cache implementation for tightly-coupled multicore clusters
Author :
Pinto, Claudio ; Benini, Luca
Author_Institution :
DEI Dept., Univ. of Bologna, Bologna, Italy
Abstract :
A widely adopted design paradigm for many-core accelerators features processing elements grouped in clusters. For reasons of area, power and design simplicity, processors in the same cluster are often not equipped with data caches but rather share a tightly coupled data memory (TCDM). Although a TCDM is more energy- and area-efficient than a cache, it requires a higher programming effort because memory must be explicitly managed with DMA-based L3-to-TCDM copies. In this context, software caches can be used to automatically transfer data between the local TCDM and the external memory, simplifying the programmer's task. In this paper we present an implementation of a software cache for the STMicroelectronics STHORM many-core accelerator, which features an L1 TCDM shared by 16 processors in a cluster. Our main contribution is the design of a fast, thread-safe cache that allows parallel access from different processing elements inside the same cluster. We evaluate our implementation with micro-benchmarks as well as a real-world application from the computer vision domain. Results show that a software cache provides major performance improvements over L3 allocation of large data structures, even when it is aggressively shared among many parallel threads.
Keywords :
cache storage; data structures; electronic data interchange; multi-threading; multiprocessing systems; DMA-based L3; L1 TCDM; L3 allocation; STMicroelectronics STHORM many-core accelerator; TCDM copy; computer vision domain; data-caches; design paradigm; external memory; local TCDM; many-core accelerators; micro-benchmarks; parallel access; parallel threads; performance improvements; processing elements; programming effort; software caches; thread-safe cache; thread-safe software cache implementation; tightly coupled data memory; tightly-coupled multicore clusters; transfer data; Data structures; Hardware; Program processors; Radiation detectors; Synchronization;
Conference_Title :
Application-Specific Systems, Architectures and Processors (ASAP), 2013 IEEE 24th International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
978-1-4799-0494-5
DOI :
10.1109/ASAP.2013.6567591