• DocumentCode
    1933953
  • Title
    PSLP: Padded SLP automatic vectorization
  • Author
    Porpodas, Vasileios; Magni, Alberto; Jones, Timothy M.
  • Author_Institution
    Comput. Lab., Univ. of Cambridge, Cambridge, UK
  • fYear
    2015
  • fDate
    7-11 Feb. 2015
  • Firstpage
    190
  • Lastpage
    201
  • Abstract
    The need to increase performance and power efficiency in modern processors has led to a wide adoption of SIMD vector units. All major vendors support vector instructions and the trend is pushing them to become wider and more powerful. However, writing code that makes efficient use of these units is hard and leads to platform-specific implementations. Compiler-based automatic vectorization is one solution for this problem. In particular, the Superword-Level Parallelism (SLP) vectorization algorithm is the primary way to automatically generate vector code starting from straight-line scalar code. SLP is implemented in all major compilers, including GCC and LLVM. SLP relies on finding sequences of isomorphic instructions to pack together into vectors. However, this hinders the applicability of the algorithm as isomorphic code sequences are not common in practice. In this work we propose a solution to overcome this limitation. We introduce Padded SLP (PSLP), a novel vectorization algorithm that can vectorize code containing non-isomorphic instruction sequences. It injects a near-minimal number of redundant instructions into the code to transform non-isomorphic sequences into isomorphic ones. The padded instruction sequence can then be successfully vectorized. Our experiments show that PSLP improves vectorization coverage across a number of kernels and full benchmarks, decreasing execution time by up to 63%.
  • Keywords
    parallel processing; program compilers; support vector machines; PSLP; SIMD vector units; compiler-based automatic vectorization; isomorphic instructions; nonisomorphic instruction sequences; padded SLP automatic vectorization; straight-line scalar code; superword-level parallelism vectorization algorithm; vectorization algorithm; vendors support vector instructions; writing code; Algorithm design and analysis; Assembly; Educational institutions; Parallel processing; Program processors; Registers; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Code Generation and Optimization (CGO), 2015 IEEE/ACM International Symposium on
  • Conference_Location
    San Francisco, CA
  • Type
    conf
  • DOI
    10.1109/CGO.2015.7054199
  • Filename
    7054199