• DocumentCode
    1997387
  • Title

    Practical SIMD Vectorization Techniques for Intel® Xeon Phi Coprocessors

  • Author

    Tian, Xinmin ; Saito, Hiroshi ; Preis, Serguei V. ; Garcia, Eric N. ; Kozhukhov, Sergey S. ; Masten, Michael K. ; Cherkasov, Aleksei G. ; Panchenko, Nikolay

  • Author_Institution
    Mobile Computing & Compilers, Software & Services Group, Intel Corp., Santa Clara, CA, USA
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    1149
  • Lastpage
    1158
  • Abstract
    The Intel® Xeon Phi coprocessor is based on the Intel® Many Integrated Core (Intel® MIC) architecture, an innovative new processor architecture that combines abundant thread parallelism with long SIMD vector units. Efficiently exploiting SIMD vector units is one of the most important aspects of achieving high performance for application code running on Intel® Xeon Phi coprocessors. In this paper, we present several practical SIMD vectorization techniques, such as less-than-full-vector loop vectorization, Intel® MIC specific alignment optimization, and small matrix transpose/multiplication 2-D vectorization, implemented in the Intel® C/C++ and Fortran production compilers for Intel® Xeon Phi coprocessors. A set of workloads from several application domains is employed to conduct the performance study of our SIMD vectorization techniques. The performance results show that we achieved up to 12.5x performance gain on the Intel® Xeon Phi coprocessor.
  • Keywords
    FORTRAN; coprocessors; optimisation; parallel architectures; Fortran production compilers; Intel Many Integrated Core architecture; Intel Xeon Phi coprocessors; SIMD vectorization techniques; innovative new processor architecture; small matrix transpose-multiplication 2-D vectorization; Computer architecture; Coprocessors; Microwave integrated circuits; Optimization; Parallel processing; Registers; Vectors; Intel® MIC Architecture; Intel® Xeon Phi coprocessor; SIMD vectorization; compiler optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2013 IEEE 27th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW)
  • Conference_Location
    Cambridge, MA
  • Print_ISBN
    978-0-7695-4979-8
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2013.245
  • Filename
    6651001