DocumentCode :
3130177
Title :
Extended split-issue: enabling flexibility in the hardware implementation of NUAL VLIW DSPs
Author :
Iyer, Bharath ; Srinivasan, Sadagopan ; Jacob, Bruce
Author_Institution :
Dept. of Electr. & Comput. Eng., Maryland Univ., College Park, MD, USA
fYear :
2004
fDate :
19-23 June 2004
Firstpage :
364
Lastpage :
375
Abstract :
VLIW architecture based DSPs have become widespread due to the combined benefits of simple hardware and compiler-extracted instruction-level parallelism. However, the VLIW instruction set architecture and its hardware implementation are tightly coupled, especially so for Non-Unit Assumed Latency (NUAL) VLIWs. The problem of object code compatibility across processors having different numbers of functional units or hardware latencies has been the Achilles' heel of this otherwise powerful architecture. In this paper, we propose eXtended Split-Issue (XSI), a novel mechanism that breaks the instruction packet syntax of an NUAL VLIW compiler without violating the dataflow dependences. XSI provides a designer the freedom of disassociating the hardware implementation of the NUAL VLIW processor from the instruction set architecture. Further, we investigate fairly radical (in the context of VLIW) changes to the hardware, such as removing an adder, adding a multiplier, and incorporating simultaneous multithreading (SMT), to show that our technique works for a variety of hardware configurations without compromising on performance. The technique can be used in both single-threaded and multi-threaded architectures to achieve a level of flexibility heretofore unavailable in the VLIW arena.
Keywords :
digital signal processing chips; instruction sets; multi-threading; multiprocessing systems; parallel architectures; parallel machines; program compilers; NUAL VLIW DSP; NUAL VLIW compiler; NUAL VLIW processor; VLIW architecture; VLIW instruction set architecture; compiler-extracted instruction-level parallelism; dataflow dependences; extended split-issue; hardware configurations; hardware implementation; hardware latencies; hardware parallelism; instruction packet syntax; multithreaded architecture; nonunit assumed latency; object code compatibility; simultaneous multithreading; single-threaded architecture; Computer architecture; Costs; Delay; Digital signal processing; Hardware; Instruction sets; Jacobian matrices; Program processors; Programming profession; VLIW;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Architecture, 2004. Proceedings. 31st Annual International Symposium on
ISSN :
1063-6897
Print_ISBN :
0-7695-2143-6
Type :
conf
DOI :
10.1109/ISCA.2004.1310788
Filename :
1310788