• DocumentCode
    1799871
  • Title

    PORPLE: An Extensible Optimizer for Portable Data Placement on GPU

  • Author

    Guoyang Chen ; Bo Wu ; Dong Li ; Xipeng Shen

  • Author_Institution
    Dept. of Comput. Sci., North Carolina State Univ., Raleigh, NC, USA
  • fYear
    2014
  • fDate
    13-17 Dec. 2014
  • Firstpage
    88
  • Lastpage
    100
  • Abstract
    GPU is often equipped with complex memory systems, including global memory, texture memory, shared memory, constant memory, and various levels of cache. Where to place the data is important for the performance of a GPU program. However, the decision is difficult for a programmer to make because of architecture complexity and the sensitivity of suitable data placements to input and architecture changes. This paper presents PORPLE, a portable data placement engine that enables a new way to solve the data placement problem. PORPLE consists of a mini specification language, a source-to-source compiler, and a runtime data placer. The language allows an easy description of a memory system; the compiler transforms a GPU program into a form amenable to runtime profiling and data placement; the placer, based on the memory description and data access patterns, identifies on the fly appropriate placement schemes for data and places them accordingly. PORPLE is distinctive in being adaptive to program inputs and architecture changes, being transparent to programmers (in most cases), and being extensible to new memory architectures. Our experiments on three types of GPU systems show that PORPLE is able to consistently find optimal or near-optimal placement despite the large differences among GPU architectures and program inputs, yielding up to 2.08X (1.59X on average) speedups on a set of regular and irregular GPU benchmarks.
  • Keywords
    data handling; graphics processing units; memory architecture; program compilers; specification languages; GPU program; PORPLE; architecture complexity; data access patterns; extensible optimizer; memory architectures; memory description; mini specification language; portable data placement; runtime data placer; runtime profiling; source-to-source compiler; Arrays; Engines; Graphics processing units; Instruction sets; Kernel; Runtime; cache; compiler; data placement; hardware specification language;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Microarchitecture (MICRO), 2014 47th Annual IEEE/ACM International Symposium on
  • Conference_Location
    Cambridge
  • ISSN
    1072-4451
  • Type

    conf

  • DOI
    10.1109/MICRO.2014.20
  • Filename
    7011380