• DocumentCode
    129182
  • Title

    Process variation-aware workload partitioning algorithms for GPUs supporting spatial-multitasking

  • Author

    Aguilera, Pedro; Lee, Jungseob; Farmahini-Farahani, Amin; Morrow, Katherine; Schulte, Michael; Kim, Nam Sung

  • Author_Institution
    Univ. of Wisconsin - Madison, Madison, WI, USA
  • fYear
    2014
  • fDate
    24-28 March 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    High-level programming languages have transformed graphics processing units (GPUs) from domain-restricted devices into powerful compute platforms. Yet many “general-purpose GPU” (GPGPU) applications fail to fully utilize the GPU resources. Executing multiple applications simultaneously on different regions of the GPU (spatial multitasking) thus improves system performance. However, within-die process variations lead to significantly different maximum operating frequencies (Fmax) of the streaming multiprocessors (SMs) within a GPU. As the chip size and number of SMs per chip increase, the frequency variation is also expected to increase, exacerbating the problem. The increased number of SMs also provides a unique opportunity: we can allocate resources to concurrently executing applications based on how those applications are affected by the different available Fmax values. In this paper, we study the effects of per-SM clocking on spatial multitasking-capable GPUs. We demonstrate two factors that affect the performance of simultaneously running applications: (i) the SM partitioning algorithm that decides how many resources to assign to each application, and (ii) the assignment of SMs to applications based on the operating frequencies of those SMs and the applications' characteristics. Our experimental results show that spatial multitasking that partitions SMs based on application characteristics, when combined with per-SM clocking, can greatly improve application performance by up to 46% on average compared to cooperative multitasking with global clocking.
  • Keywords
    graphics processing units; high level languages; multiprogramming; GPU; high level programming languages; maximum operating frequencies; process variation aware workload partitioning algorithms; spatial multitasking; streaming multiprocessors; within die process variations; Business process re-engineering; Clocks; Frequency control; Kernel; Multitasking; Partitioning algorithms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014
  • Conference_Location
    Dresden
  • Type

    conf

  • DOI
    10.7873/DATE.2014.189
  • Filename
    6800390