• DocumentCode
    2297957
  • Title

    The effects of explicitly parallel mechanisms on the multi-ALU processor cluster pipeline

  • Author

    Chang, Andrew ; Dally, William J. ; Keckler, Stephen W. ; Carter, Nicholas P. ; Lee, Whay S.

  • Author_Institution
    Comput. Syst. Lab., Stanford Univ., CA, USA
  • fYear
    1998
  • fDate
    5-7 Oct 1998
  • Firstpage
    474
  • Lastpage
    481
  • Abstract
    Continuing reductions in on-chip geometries yield increasing numbers of transistors per chip and fundamentally faster devices, but also result in effectively slower wires. This combination presents significant challenges for new microprocessor architectures. The disparity in performance between on-chip arithmetic units and memory creates effectively longer latencies. The changing balance between gate delay and wire delay penalizes global interactions. The MIT Multi-ALU Processor (MAP) architecture incorporates three explicitly parallel mechanisms to address these challenges. Efficient intercluster interactions enable instruction scheduling across clustered arithmetic units. Deferred exceptions based on ERRVALs facilitate aggressive instruction reordering and speculation. Zero-cycle multithreading provides latency tolerance without sacrificing single-threaded performance. In this paper, we describe each of these mechanisms and quantify their impact on the area and routing of the cluster pipeline in the 5-million-transistor MAP chip. Zero-cycle multithreading accounts for over 44% of the total cluster area. Support for ERRVALs requires very little area (less than 4%). The intercluster interaction mechanisms require minimal cluster area and less than 5% of the available global routing resources, but enable fully general access across clusters and between all arithmetic units.
  • Keywords
    parallel architectures; performance evaluation; pipeline processing; MIT Multi-ALU Processor; cluster pipeline; explicitly parallel mechanisms; instruction scheduling; latency tolerance; microprocessor architectures; multithreading; parallel mechanisms; Arithmetic; Communication switching; Delay; Laboratories; Multithreading; Pipelines; Processor scheduling; Routing; Switches; Yarn;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Design: VLSI in Computers and Processors, 1998. ICCD '98. Proceedings. International Conference on
  • Conference_Location
    Austin, TX
  • ISSN
    1063-6404
  • Print_ISBN
    0-8186-9099-2
  • Type

    conf

  • DOI
    10.1109/ICCD.1998.727091
  • Filename
    727091