DocumentCode :
167385
Title :
Prototyping the MBTAC Processor for the REPLICA CMP
Author :
Forsell, Martti ; Roivainen, Jussi ; Leppanen, Ville
Author_Institution :
VTT Tech. Res. Centre of Finland, Oulu, Finland
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
709
Lastpage :
716
Abstract :
Current chip multiprocessors (CMPs) have mostly been designed by replicating sequential, single-core processors and providing some support for operating them with a shared memory. As a result, they define an asynchronous computational model of threads, often require maximizing the locality of memory references to get decent performance, and feature high intercommunication overheads, all of which make parallel programming tedious for general-purpose functionality. Most of these problems can be eliminated by designing the processor architecture for scalable general-purpose computing from the very beginning, as is done in processors for configurable emulated shared memory (CESM) CMPs. These processors provide support for machine instruction-level synchronization, make use of multithreading to support latency-insensitive computation, and promote the concept of uniform synchronous shared memory for easy variable allocation and convenient data exchange. In our earlier work we proposed the first CESM architecture, TOTAL ECLIPSE, composed of early MBTAC processors that make use of very low-overhead multithreading, a parallel-computing-savvy functional unit organization, support for fast synchronization between instructions and threads, and highly efficient multioperations. Unfortunately, certain key parts of these processors turned out to be hard to implement, and overall they lacked support for ordered multiprefix operations and full configurability of the CESM scheme. In this paper we introduce a new, fully configurable version of the MBTAC processor for our new REPLICA CESM architecture, along with its first FPGA implementations. To evaluate it, we execute short test programs on it and preliminarily compare it against Intel Core i7 and DLX processors. Our FPGA design flow and testing approach are also described.
Keywords :
field programmable gate arrays; microprocessor chips; multi-threading; parallel programming; shared memory systems; synchronisation; CESM CMP; CESM architecture TOTAL ECLIPSE; CESM scheme; DLX processors; FPGA design flow approach; FPGA implementations; FPGA testing approach; Intel Core i7 processors; MBTAC processor prototyping; MBTAC processors; REPLICA CESM architecture; REPLICA CMP; asynchronous thread computational model; chip multiprocessors; configurable emulated shared memory CMP; intercommunication overheads; latency-insensitive computation; low-overhead multithreading; machine instruction-level synchronization; memory references; multiprefix operations; parallel computing savvy functional unit organization; parallel programming; scalable general purpose computing; sequential core processors; single core processors; uniform synchronous shared memory; Field programmable gate arrays; Instruction sets; Memory management; Phase change random access memory; Prototypes; Synchronization; FPGA prototype; NUMA; PRAM; chaining; multithreaded processor; parallel computing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
Type :
conf
DOI :
10.1109/IPDPSW.2014.82
Filename :
6969452