• DocumentCode
    639331
  • Title

    Automatic OpenCL work-group size selection for multicore CPUs

  • Author

    Jungju Oh ; Zajic, Alenka ; Prvulovic, Milos

  • Author_Institution
    Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA
  • fYear
    2013
  • fDate
    7-11 Sept. 2013
  • Firstpage
    387
  • Lastpage
    398
  • Abstract
    Growth in core count creates an increasing demand for interconnect bandwidth, driving a change from shared buses to packet-switched on-chip interconnects. However, this increases the latency between cores separated by many links and switches. In this paper, we show that a low-latency unswitched interconnect built with transmission lines can be synergistically used with a high-throughput switched interconnect. First, we design a broadcast ring as a chain of unidirectional transmission line structures with very low latency but limited throughput. Then, we create a new adaptive packet steering policy that judiciously uses the limited throughput of this ring by balancing expected latency benefit and ring utilization. Although the ring uses 1.3% of the on-chip metal area, our experimental results show that, in combination with our steering, it provides an execution time reduction of 12.4% over a mesh-only baseline.
  • Keywords
    field buses; integrated circuit interconnections; network-on-chip; packet switching; switches; transmission lines; adaptive packet steering policy; broadcast ring; core count; execution time reduction; high-throughput switched interconnect; high-throughput switched on-chip interconnect; interconnect bandwidth; latency benefit; links; low-latency unswitched TL ring; low-latency unswitched interconnect; mesh-only baseline; on-chip metal area; packet-switched on-chip interconnects; ring utilization; shared buses; traffic steering; unidirectional transmission line structures; Couplers; Delays; Receivers; Switches; Throughput; Transmitters; Wires; OpenCL; automatic selection; multicore CPU; performance portability; profiling; work-group size; working-set
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Architectures and Compilation Techniques (PACT), 2013 22nd International Conference on
  • Conference_Location
    Edinburgh
  • ISSN
    1089-795X
  • Print_ISBN
    978-1-4799-1018-2
  • Type

    conf

  • DOI
    10.1109/PACT.2013.6618827
  • Filename
    6618827