• DocumentCode
    2490749
  • Title

    Efficient address generation for affine subscripts in data-parallel programs

  • Author

    Shih, Kuei-Ping ; Sheu, Jang-Ping ; Chang, Chih-Yung

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Central Univ., Chung-Li, Taiwan
  • fYear
    1998
  • fDate
    14-16 Dec 1998
  • Firstpage
    758
  • Lastpage
    765
  • Abstract
    This paper presents an efficient compilation technique to generate the local memory access sequences for block-cyclically distributed array references with affine subscripts in data-parallel programs. For the memory accesses of an array reference with affine subscript within a two-nested loop, there exist repetitive patterns both at the outer and inner loops. We use tables to record the memory accesses of repetitive patterns. According to these tables, a new start-computation algorithm is proposed to compute the starting elements on a processor for each outer loop iteration. The complexities of the table constructions are O(k+s2), where k is the distribution block size and s2 is the access stride for the inner loop. After tables are constructed, generating each starting element for each outer loop iteration can run in O(1) time. Moreover, we also show that the repetitive iterations for outer loop are Pk/gcd(Pk,s1), where P is the number of processors and s1 is the access stride for the outer loop. Therefore, the total complexity to generate the local memory access sequences for a block-cyclically distributed array with affine subscript in a two-nested loop is O(Pk/gcd/(Pk,s1)+k+s 2)
  • Keywords
    computational complexity; data structures; distributed memory systems; parallel programming; program compilers; program control structures; access stride; address generation; affine subscripts; array reference; block-cyclically distributed array references; compilation technique; complexity; data-parallel programs; local memory access sequences; loop iteration; memory access; repetitive patterns; start-computation algorithm; starting element; tables; two-nested loop; Computer science; Concurrent computing; Data engineering; Distributed computing; Educational institutions; Electronic mail; Induction generators; Parallel programming; Program processors; Programming profession;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Parallel and Distributed Systems, 1998. Proceedings. 1998 International Conference on
  • Conference_Location
    Tainan
  • ISSN
    1521-9097
  • Print_ISBN
    0-8186-8603-0
  • Type

    conf

  • DOI
    10.1109/ICPADS.1998.741165
  • Filename
    741165