• DocumentCode
    3728577
  • Title

    Fusion of Calling Sites

  • Author

    Douglas do Couto Teixeira;Sylvain Collange;Fernando Magno Quintão

  • Author_Institution
    Dept. de Cienc. da Comput., Univ. Fed. de Minas Gerais, Belo Horizonte-MG, Brazil
  • fYear
    2015
  • Firstpage
    90
  • Lastpage
    97
  • Abstract
    The increasing popularity of Graphics Processing Units (GPUs) has brought renewed attention to old problems related to the Single Instruction, Multiple Data execution model. One of these problems is the reconvergence of divergent threads. A divergence happens at a conditional branch when different threads disagree on the path to follow upon reaching this split point. Divergences may impose a heavy burden on the performance of parallel programs. In this paper we propose a compiler-level optimization to mitigate this performance loss. This optimization consists of merging function call sites located on different paths that sprout from the same branch. We show that our optimization adds negligible overhead to the compiler. It does not slow down programs in which it is not applicable, and it substantially accelerates those in which it is. As an example, we have been able to speed up the well-known SPLASH Fast Fourier Transform benchmark by 11%.
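    The idea of merging call sites that sit on different paths of the same branch can be illustrated at source level. The following is a minimal sketch, not the paper's compiler pass: the names `work`, `before_fusion`, and `after_fusion` are hypothetical, and the thread identifier `tid` is an ordinary integer standing in for a SIMD lane index.

```c
/* Hypothetical stand-in for a costly function that is called
 * on both sides of a branch. */
static int work(int x) {
    return x * x + 1;
}

/* Before fusion: two call sites, one per branch path. If threads
 * disagree on the condition, a SIMD machine would run the two
 * calls to work() one after the other. */
static int before_fusion(int tid, int a, int b) {
    int r;
    if (tid % 2 == 0) {
        r = work(a);
    } else {
        r = work(b);
    }
    return r;
}

/* After fusion: the branch only selects the argument, and a single
 * call site is shared by both paths, so all threads can execute
 * the body of work() together. */
static int after_fusion(int tid, int a, int b) {
    int arg = (tid % 2 == 0) ? a : b;
    return work(arg);
}
```

    Both versions return the same value for every `tid`; only the number of call sites reached under divergence differs. The sketch assumes `work` has no side effects that depend on which path called it.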
  • Keywords
    "Optimization","Instruction sets","Merging","Hardware","Computer architecture","Benchmark testing","Graphics processing units"
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture and High Performance Computing (SBAC-PAD), 2015 27th International Symposium on
  • ISSN
    1550-6533
  • Type

    conf

  • DOI
    10.1109/SBAC-PAD.2015.16
  • Filename
    7379838