Mamba: A scalable communication centric multi-threaded processor architecture

Author

Chadwick, Gregory A. ; Moore, Simon W.

Author_Institution

Comput. Lab., Univ. of Cambridge, Cambridge, UK

fYear

2012

fDate

Sept. 30 2012-Oct. 3 2012

Firstpage

277

Lastpage

283

Abstract

In this paper we describe Mamba, an architecture designed for multi-core systems. Mamba has two major aims: (i) make on-chip communication explicit to the programmer so they can optimize for it and (ii) support many threads and supply very lightweight communication and synchronization primitives for them. These aims are based on the observations that: (i) as feature sizes shrink, on-chip communication becomes relatively more expensive than computation and (ii) as systems become increasingly multi-core, we need highly scalable approaches to inter-thread communication and synchronization. We employ a network of processors where a given memory access will always go to the same cache, removing the need for a coherence protocol and allowing the program explicit control over all communication. A presence bit associated with each word provides a very lightweight, fine-grained synchronization primitive. We demonstrate an FPGA implementation with micro-benchmarks of standard spinlock and FIFO implementations and show that presence-bit-based implementations provide more efficient locking and lower-latency FIFO communication than a conventional shared-memory implementation, whilst also requiring fewer memory accesses. We also show that Mamba performance is insensitive to total thread count, allowing the use of as many threads as desired.
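The presence-bit idea in the abstract can be illustrated with a small software analogy. The sketch below is not Mamba's instruction set or hardware behaviour, which the abstract does not specify. It assumes a full/empty style semantics in which a write waits until the word is empty and a read waits until the word is present and then clears the bit. The names `PresenceWord`, `write_when_empty`, `read_and_clear`, and `run_demo` are hypothetical and chosen only for illustration.

```python
import threading


class PresenceWord:
    """Software model of one memory word with a presence bit (assumed semantics)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._present = False  # the presence bit
        self._value = None     # the word's data

    def write_when_empty(self, value):
        """Block until the word is empty, then store the value and set the bit."""
        with self._cond:
            while self._present:
                self._cond.wait()
            self._value = value
            self._present = True
            self._cond.notify_all()

    def read_and_clear(self):
        """Block until the word is present, then return the value and clear the bit."""
        with self._cond:
            while not self._present:
                self._cond.wait()
            value = self._value
            self._present = False
            self._cond.notify_all()
            return value


def run_demo(n_items=5):
    """Pass n_items integers from a producer thread to a consumer thread."""
    word = PresenceWord()
    received = []

    def producer():
        for i in range(n_items):
            word.write_when_empty(i)

    def consumer():
        for _ in range(n_items):
            received.append(word.read_and_clear())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    print(run_demo())
```

Under these assumptions a single word acts as a one-slot FIFO: the producer and consumer synchronize on the word itself, with no separate lock variable or polling loop over shared state. In software the condition variable stands in for whatever mechanism the hardware uses to stall or wake a thread waiting on the bit.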

Keywords

cache storage; field programmable gate arrays; multi-threading; multiprocessing systems; parallel memories; queueing theory; synchronisation; FIFO; FPGA; Mamba; bit based implementation; cache storage; fine grained synchronization primitive; interthread communication; lightweight communication; memory access; multicore system; multithreaded processor architecture; on-chip communication; optimization; scalable communication; Benchmark testing; Computer architecture; Field programmable gate arrays; Instruction sets; Message systems; Registers;

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Design (ICCD), 2012 IEEE 30th International Conference on

Conference_Location

Montreal, QC

ISSN

1063-6404

Print_ISBN

978-1-4673-3051-0

Type

conf

DOI

10.1109/ICCD.2012.6378652

Filename

6378652