• DocumentCode
    1954643
  • Title
    Modestly faster histogram computations on GPUs
  • Author
    Brown, Shawn; Snoeyink, Jack
  • Author_Institution
    UNC Chapel Hill, Columbia, NC, USA
  • fYear
    2012
  • fDate
    13-14 May 2012
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    We present TRISH, a 256-bin histogram method for byte data that runs up to 50% faster than previous GPU methods on random data and 2-4× faster on image data. The performance gains come from reducing total cycle counts, which in turn comes from improving 1) thread level parallelism (TLP), 2) instruction level parallelism (ILP), and 3) software vector parallelism (VP). TLP is improved by increasing occupancy from 2 to 3 thread blocks, achieved by compacting “per thread” histograms in shared memory, and by using register arrays. ILP is improved by increasing the number of independent instructions via loop unrolling by a factor of k = [1..63] and by batching operations into groups of four. VP is supported by compacting bin counts into four 8-bit quads per 32-bit element and by reducing binning and accumulating instructions by working with 32-bit elements as overlapping 16-bit pairs instead of 4 individual bytes. TRISH is a deterministic algorithm that avoids atomic operations and gives data-independent performance.
  • Keywords
    deterministic algorithms; graphics processing units; shared memory systems; 256-bin histogram method; GPU; ILP; TLP; TRISH; VP; binning reduction; byte data; deterministic algorithm; histogram computations; image data; instruction accumulation; instruction level parallelism; loop unrolling; random data; register arrays; shared memory; software vector parallelism; thread blocks; thread level parallelism; total cycle count reduction; Abstracts; Buildings; Kernel; Registers; Throughput; CUDA; Histogram; Parallel Processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovative Parallel Computing (InPar), 2012
  • Conference_Location
    San Jose, CA
  • Print_ISBN
    978-1-4673-2632-2
  • Electronic_ISBN
    978-1-4673-2631-5
  • Type
    conf
  • DOI
    10.1109/InPar.2012.6339589
  • Filename
    6339589
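
The abstract's idea of compacting bin counts into four 8-bit counters per 32-bit element can be illustrated with a small CPU-side sketch. This is not the authors' GPU kernel. The function names, the bin-to-lane mapping (`value >> 2`, `value & 3`), and the `flush_every` parameter are assumptions made for illustration only. The sketch shows why such packing needs a periodic flush: an 8-bit lane holds at most 255, so the packed counters are added into full-width totals before any lane can overflow into its neighbor.

```python
def _flush(packed, totals):
    """Add every 8-bit lane into its full-width total, then clear the packed words."""
    for word in range(64):
        w = packed[word]
        for lane in range(4):
            totals[4 * word + lane] += (w >> (8 * lane)) & 0xFF
        packed[word] = 0


def packed_histogram(data, flush_every=255):
    """256-bin histogram of byte data using four 8-bit counters per 32-bit word.

    Hypothetical layout (an assumption, not taken from the paper):
    bin b lives in word b >> 2, lane b & 3.
    """
    if not 1 <= flush_every <= 255:
        raise ValueError("flush_every must be in 1..255 so no 8-bit lane overflows")
    totals = [0] * 256   # full-width result counts
    packed = [0] * 64    # 64 words x 4 lanes = 256 bins
    pending = 0          # increments since the last flush
    for value in data:
        word, lane = value >> 2, value & 3
        packed[word] += 1 << (8 * lane)   # increment one 8-bit lane
        pending += 1
        if pending == flush_every:        # worst case: all increments hit one lane
            _flush(packed, totals)
            pending = 0
    _flush(packed, totals)
    return totals
```

For example, `packed_histogram(bytes([7] * 1000))` accumulates 1000 in bin 7 across several flushes. Like the method described in the abstract, the sketch uses no atomic operations and does the same work regardless of how the input values are distributed.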