DocumentCode :
2571127
Title :
Balancing Locality and Parallelism on Shared-Cache Multi-Core Systems
Author :
Cade, Michael Jason ; Qasem, Apan
Author_Institution :
Texas State Univ., San Marcos, TX, USA
fYear :
2009
fDate :
25-27 June 2009
Firstpage :
188
Lastpage :
195
Abstract :
The emergence of multi-core systems opens new opportunities for thread-level parallelism and dramatically increases the performance potential of applications running on these systems. However, the state of the art in performance-enhancing software falls far short of fully exploiting the hardware features of this complex new architecture, and much of the performance capability of multi-core systems therefore remains unrealized. This research addresses one facet of the problem by exploring the relationship between data locality and parallelism on multi-core architectures in which one or more levels of cache are shared among the cores. A model is presented for determining a profitable synchronization interval for concurrent threads that interact in a producer-consumer relationship. Experimental results suggest that considering the synchronization window, i.e., the amount of work individual threads are allowed to do between synchronizations, enables performance optimizations that are both parallelism- and locality-aware. The optimum synchronization window is a function of the number of threads, the data reuse patterns within the workload, and the size and configuration of the last-level cache shared among the processing units. By accounting for these factors, the calculation of the optimum synchronization window balances parallelism and data locality for maximum performance.
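The abstract's central idea, producer and consumer threads that synchronize after a bounded amount of in-flight work so the shared working set stays resident in the last-level cache, can be illustrated with a minimal sketch. This is not the paper's model: the bounded-queue pipeline and the `sync_window` heuristic (including its `cache_bytes`, `item_bytes`, and `reuse` parameters) are illustrative assumptions only.

```python
import threading
import queue


def sync_window(cache_bytes, n_threads, item_bytes, reuse):
    """Hypothetical heuristic (not the paper's model): size the
    synchronization window so each thread's in-flight data fits in
    its share of the shared last-level cache."""
    return max(1, cache_bytes // (n_threads * item_bytes * reuse))


def run_pipeline(n_items, window):
    """Producer-consumer pipeline whose bounded queue caps how far the
    producer may run ahead: at most `window` items are in flight, so
    the consumer touches data while it is still cache-hot."""
    buf = queue.Queue(maxsize=window)  # capacity = synchronization window
    consumed = []

    def producer():
        for i in range(n_items):
            buf.put(i * i)     # blocks once the window is full
        buf.put(None)          # sentinel: no more data

    def consumer():
        while True:
            item = buf.get()
            if item is None:
                break
            consumed.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

As the abstract notes, the profitable window depends on thread count, reuse, and cache size; with these assumed parameters, e.g. a 4 MiB shared cache, 4 threads, 64-byte items, and a reuse factor of 4, the heuristic yields a window of 4096 items.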
Keywords :
cache storage; concurrency control; multi-threading; parallel processing; shared memory systems; concurrent threads; data locality; data reuse pattern; locality-aware performance optimization; multicore architecture; multicore system; optimum synchronization window; parallelism-aware performance optimization; producer-consumer relationship; shared cache; thread-level parallelism; Computer architecture; Costs; Energy consumption; Frequency synchronization; Hardware; Parallel processing; Power system modeling; Software performance; Throughput; Yarn; memory hierarchy optimization; parallelism; performance tuning; shared-cache;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
11th IEEE International Conference on High Performance Computing and Communications (HPCC '09), 2009
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-4600-1
Electronic_ISBN :
978-0-7695-3738-2
Type :
conf
DOI :
10.1109/HPCC.2009.61
Filename :
5166993