Utilizing horizontal and vertical parallelism with a no-instruction-set compiler for custom datapaths

Author

Reshadi, Mehrdad ; Gorjiara, Bita ; Gajski, Daniel

Author_Institution

Center for Embedded Comput. Syst., California Univ., Irvine, CA, USA

fYear

2005

fDate

2-5 Oct. 2005

Firstpage

Lastpage

Abstract

The performance of programs can be improved by utilizing their horizontal and vertical parallelism. In some processors (VLIW-based), the compiler can utilize horizontal parallelism by controlling the schedule of independent operations. Vertical parallelism is utilized through pipelining. However, in all processors, the structure of the pipeline is fixed and the compiler has no control over it. In application-specific instruction-set processors (ASIPs), the pipeline structure can be customized and utilized in the program through custom instructions. Practical constraints on the instruction decoder limit the number and complexity of custom instructions in ASIPs. Detecting frequent and beneficial custom instructions and incorporating them into the compiler are complex and sometimes very time-consuming tasks. In this paper, we present an architecture that does not limit the number of custom functionalities that can be implemented on its datapath. Instead of using custom instructions and then relying on a hardware decoder to generate the control signals, we generate the control signal values in the compiler. Since there are no predefined instructions in this architecture, we call it a no-instruction-set computer (NISC). The NISC compiler maps the application directly onto the datapath. It has complete fine-grained control over the datapath and hence can make good use of the hardware resources as well as the horizontal and vertical parallelism in the program. We also explain the algorithm for mapping the control/data flow graph (CDFG) of a program onto a given datapath in NISC. Using our algorithm and a NISC architecture with the datapath of a MIPS, we achieved up to 70% speedup over the traditional MIPS compiler. In another experiment, we started from a base architecture and customized it by adding resources and interconnect to increase its horizontal and vertical parallelism. The algorithm achieved a speedup of up to 15.5 times by utilizing the available parallelism in the program and the datapath.

Keywords

application specific integrated circuits; logic design; microprocessor chips; parallel architectures; parallelising compilers; pipeline processing; application-specific-instruction set-processors; control signals; horizontal parallelism; instruction decoder; no-instruction-set compiler; no-instruction-set-computer; pipeline structure; pipelining; vertical parallelism; Application specific processors; Concurrent computing; Decoding; Embedded computing; Hardware; Parallel processing; Pipeline processing; Processor scheduling; Signal generators; VLIW;

fLanguage

English

Publisher

IEEE

Conference_Title

Computer Design: VLSI in Computers and Processors, 2005. ICCD 2005. Proceedings. 2005 IEEE International Conference on

Print_ISBN

0-7695-2451-6

Type

conf

DOI

10.1109/ICCD.2005.112

Filename

1524131

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=2281815