• DocumentCode
    3200426
  • Title
    Scalarization on Short Vector Machines
  • Author
    Zhao, Yuan ; Kennedy, Ken
  • Author_Institution
    Dept. of Comput. Sci., Rice Univ., Houston, TX
  • fYear
    2005
  • fDate
    20-22 March 2005
  • Firstpage
    187
  • Lastpage
    196
  • Abstract
    Scalarization is the process of converting array statements into loop nests so that they can run on a scalar machine. One technical difficulty of scalarization is that temporary storage often must be allocated to preserve the "fetch before store" semantics of array syntax. Many techniques have been developed to reduce the amount of temporary storage required and thereby improve memory hierarchy performance. With the emergence of short vector units on modern microprocessors, it is interesting to see how to extend the preexisting scalarization methods so that the underlying vector infrastructure is fully utilized while the temporary storage is kept to a minimum. In this paper, we extend a loop alignment algorithm for scalarization on short vector machines. The revised algorithm not only achieves vector execution with minimum temporary storage, but also handles data alignment properly, which is very important for performance. Our experiments on two types of widely available architectures demonstrate the effectiveness of our strategy.
  • Keywords
    instruction sets; parallel architectures; program control structures; storage allocation; vector processor systems; SIMD; array statements; array syntax; data alignment; loop alignment algorithm; loop nests; memory hierarchy performance; microprocessors; scalarization; short vector machines; temporary storage; vectorized scalar replacement; Cache storage; Computer science; Microprocessors; Prefetching
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Performance Analysis of Systems and Software, 2005. ISPASS 2005. IEEE International Symposium on
  • Conference_Location
    Austin, TX
  • Print_ISBN
    0-7803-8965-4
  • Type
    conf
  • DOI
    10.1109/ISPASS.2005.1430573
  • Filename
    1430573
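
The "fetch before store" requirement mentioned in the abstract can be illustrated with a small sketch. This is not the loop alignment algorithm from the paper. It is only a minimal Python example, assuming a hypothetical array statement of the form A(2:N-1) = A(1:N-2) + A(3:N), that shows why a naive loop breaks array semantics and how a single scalar temporary can restore them. All function names are made up for illustration.

```python
def array_statement(a):
    """Reference semantics: fetch every right-hand-side value, then store."""
    n = len(a)
    rhs = [a[i - 1] + a[i + 1] for i in range(1, n - 1)]  # all fetches first
    out = list(a)
    out[1:n - 1] = rhs  # all stores afterwards
    return out


def naive_scalarization(a):
    """Direct loop translation: reads out[i - 1] after it was overwritten."""
    out = list(a)
    for i in range(1, len(out) - 1):
        out[i] = out[i - 1] + out[i + 1]
    return out


def scalar_temp_scalarization(a):
    """Loop that keeps the old left neighbor in one scalar temporary."""
    out = list(a)
    if len(out) < 3:
        return out
    prev = out[0]  # old value of out[i - 1]
    for i in range(1, len(out) - 1):
        new = prev + out[i + 1]  # out[i + 1] has not been stored to yet
        prev = out[i]            # save the old value before overwriting it
        out[i] = new
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(array_statement(data))            # [1, 4, 6, 8, 5]
    print(naive_scalarization(data))        # [1, 4, 8, 13, 5]
    print(scalar_temp_scalarization(data))  # [1, 4, 6, 8, 5]
```

The reference version needs a temporary as large as the array section, while the last version needs only one scalar. On a short vector machine the corresponding temporary would presumably have to hold a whole vector register's worth of elements and respect alignment, which is the setting the abstract describes.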