• DocumentCode
    1266615
  • Title

    Can GPGPU Programming Be Liberated from the Data-Parallel Bottleneck?

  • Author

    Gaster, Benedict R. ; Howes, Lee

  • Volume
    45
  • Issue
    8
  • fYear
    2012
  • fDate
    8/1/2012 12:00:00 AM
  • Firstpage
    42
  • Lastpage
    52
  • Abstract
    With the growth in transistor counts in modern hardware, heterogeneous systems are becoming commonplace. Core counts are increasing such that GPU and CPU designs are reaching deep into the tens of cores. For performance reasons, different cores in a heterogeneous platform follow different design choices. Based on throughput computing goals, GPU cores tend to support wide vectors and substantial register files. Current designs optimize CPU cores for latency, dedicating logic to caches and out-of-order dependence control. Heterogeneous parallel primitives (HPP) is an object-oriented, C++11-based programming model that addresses two major shortcomings in current GPGPU programming models on both CPUs and massively multithreaded GPUs: it supports full composability by defining abstractions using distributed arrays and barrier objects, and it increases flexibility in execution by introducing braided parallelism. The authors implemented a feature-complete version of HPP, including all syntactic constructs, that runs on top of a task-parallel runtime executing on the CPU. They continue to develop and improve the model, including reducing overhead due to channel management, and plan to make a public version available sometime in the future.
  • Keywords
    C++ language; graphics processing units; integrated circuit design; multi-threading; object-oriented programming; CPU designs; GPGPU programming; GPU designs; barrier objects; braided parallelism; channel management; core counts; data-parallel bottleneck; distributed arrays; general-purpose computing-on-graphics processing units; heterogeneous parallel primitives; heterogeneous systems; object-oriented C++11-based programming model; task-parallel runtime; Graphics processing unit; Hardware; Indexes; Multithreading; Parallel processing; Performance evaluation; Programming; data-parallel execution; massively threaded computing systems; persistent threading
  • fLanguage
    English
  • Journal_Title
    Computer
  • Publisher
    IEEE
  • ISSN
    0018-9162
  • Type

    jour

  • DOI
    10.1109/MC.2012.257
  • Filename
    6272260