DocumentCode :
1825861
Title :
Architectural Support for Exploiting Fine Grain Parallelism
Author :
Rosas-Ham, Demian ; Herath, Isuru ; Yiapanis, Paraskevas ; Lujan, Mikel ; Watson, Ian
Author_Institution :
Comput. Sci. Sch., Univ. of Manchester, Manchester, UK
fYear :
2012
fDate :
25-27 June 2012
Firstpage :
61
Lastpage :
70
Abstract :
The advent of multi-core processors, particularly with projections that numbers of cores will continue to increase, has focused attention on parallel programming. It is widely recognized that current programming techniques, including those that are used for scientific parallel programming, will not allow the easy formulation of general purpose applications. An area which is receiving interest is the use of programming styles which are side-effect free. Previous work on parallel functional programming demonstrated the potential of this to permit the easy exploitation of parallelism. Recent systems like Cilk use conventional languages such as C but encourage the use of a largely functional style (side-effect free) when writing programs. An important part of the Cilk runtime is a system to balance the usage of cores. In this paper we present SLAM (Spreading Load with Active Messages), a dynamic load balancing system based on functional language evaluation techniques. We show that SLAM, provided with appropriate hardware support, significantly outperforms the Cilk system. We evaluated our system using tiled CMPs with private and shared L2 caches separately. Our results show that, for the benchmarks evaluated, SLAM outperforms Cilk by 28% on average when using 32-core CMPs with private L2 caches. For the case of the CMPs with shared L2 caches, SLAM was on average 21% faster than Cilk when using 32 cores and 62% faster when using 64 cores.
Keywords :
C language; cache storage; microprocessor chips; multiprocessing systems; parallel programming; resource allocation; shared memory systems; 32-core CMP; C languages; Cilk runtime; Cilk system; SLAM; dynamic load balancing system; exploiting fine grain parallelism; functional language evaluation techniques; multicore processors; parallel functional programming; private L2 caches; scientific parallel programming; shared L2 caches; side-effect free; spreading load with active messages; writing programs; Arrays; Hardware; Load management; Parallel processing; Program processors; Registers; Simultaneous localization and mapping; chip multiprocessors; dynamic load balancing; parallel programming; work stealing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS)
Conference_Location :
Liverpool
Print_ISBN :
978-1-4673-2164-8
Type :
conf
DOI :
10.1109/HPCC.2012.19
Filename :
6332160