CHOMP: A Framework and Instruction Set for Latency Tolerant, Massively Multithreaded Processors

Author

Leidel, John D. ; Wadleigh, Kevin ; Bolding, Joe ; Brewer, Tony ; Walker, David

Author_Institution

Convey Comput. Corp., Richardson, TX, USA

fYear

2012

fDate

10-16 Nov. 2012

Firstpage

232

Lastpage

239

Abstract

Given the recent advent of the multicore era [1], we find that parallel application performance is no longer solely gated by an architecture's core arithmetic unit performance. Memory bandwidth has failed to grow at the same rate as effective core density. This paper presents a framework for constructing tightly coupled, chip-multithreading (CMT) processors that contain specific features well-suited to hiding latency to main memory and executing highly concurrent applications. This framework, deemed the “Convey Hybrid OpenMP” or CHOMP architecture, is built around a RISC instruction set that permits the hardware and software runtime mechanisms to participate in efficient scheduling of concurrent application workloads regardless of the distribution and type of instructions utilized. In this manner, all instructions in CHOMP have the ability to participate in the concurrency algorithms present in the hardware scheduler that drive context switch events. This, coupled with a set of hardware-supported extended memory semantic instructions, means that the CHOMP architecture is well suited to executing applications that access memory using non-unit stride or irregular access patterns. Furthermore, the CHOMP architecture and framework contain specific logic and instruction set support that allows application-level, dynamic power gating of individual register files and function pipes.

Keywords

instruction sets; microprocessor chips; multi-threading; multiprocessing systems; parallel processing; CHOMP; CHOMP architecture; CMT processors; RISC instruction set; chip multithreading; context switch events; convey hybrid OpenMP; core density; hardware scheduler; instruction set; latency tolerant; massively multithreaded processors; memory bandwidth; multicore era; parallel application performance; CMT; FPGA; OpenMP; RISC; concurrency; heterogeneous architecture; manycore; multicore

fLanguage

English

Publisher

IEEE

Conference_Titel

2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)

Conference_Location

Salt Lake City, UT

Print_ISBN

978-1-4673-6218-4

Type

conf

DOI

10.1109/SC.Companion.2012.39

Filename

6495821