• DocumentCode
    2502939
  • Title
    Vector Lane Threading
  • Author
    Rivoire, Suzanne ; Schultz, Rebecca ; Okuda, Tomofumi ; Kozyrakis, Christos
  • Author_Institution
    Dept. of Electr. Eng., Stanford Univ., Palo Alto, CA
  • fYear
    2006
  • fDate
    14-18 Aug. 2006
  • Firstpage
    55
  • Lastpage
    64
  • Abstract
    Multi-lane vector processors achieve excellent computational throughput for programs with high data-level parallelism (DLP). However, application phases without significant DLP are unable to fully utilize the datapaths in the vector lanes. In this paper, we propose vector lane threading (VLT), an architectural enhancement that allows idle vector lanes to run short-vector or scalar threads. VLT-enhanced vector hardware can exploit both data-level and thread-level parallelism to achieve higher performance. We investigate implementation alternatives for VLT, focusing mostly on the instruction issue bandwidth requirements. We demonstrate that VLT's area overhead is small. For applications with short vectors, VLT leads to additional speedup of 1.4 to 2.3 over the base vector design. For scalar threads, VLT outperforms a 2-way CMP design by a factor of two. Overall, VLT allows vector processors to reach high computational throughput for a wider range of parallel programs and become a competitive alternative to CMP systems.
  • Keywords
    parallel architectures; vector processor systems; data-level parallelism; multilane vector processors; thread-level parallelism; vector lane threading; Bandwidth; Concurrent computing; Hardware; Multithreading; Oceans; Parallel processing; Telecommunication computing; Throughput; Vector processors; Yarn;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 2006. ICPP 2006. International Conference on
  • Conference_Location
    Columbus, OH
  • ISSN
    0190-3918
  • Print_ISBN
    0-7695-2636-5
  • Type
    conf
  • DOI
    10.1109/ICPP.2006.74
  • Filename
    1690605