DocumentCode
1685802
Title
Using hardware multithreading to overcome broadcast/reduction latency in an associative SIMD processor
Author
Schaffer, Kevin ; Walker, Robert A.
Author_Institution
Dept. of Comput. Sci., Kent State Univ., Kent, OH
fYear
2008
Firstpage
1
Lastpage
7
Abstract
The latency of broadcast/reduction operations has a significant impact on the performance of SIMD processors. This is especially true for associative programs, which make extensive use of global search operations. Previously, we developed a prototype associative SIMD processor that uses hardware multithreading to overcome the broadcast/reduction latency. In this paper we show, through simulations of the processor running an associative program, that hardware multithreading is able to improve performance by increasing system utilization, even for processors with hundreds or thousands of processing elements. However, the choice of thread scheduling policy used by the hardware is critical in determining the actual utilization achieved. We consider three thread scheduling policies and show that a thread scheduler that avoids issuing threads that will stall due to pipeline dependencies or thread synchronization operations is able to maintain system utilization independent of the number of threads.
Keywords
multi-threading; parallel processing; scheduling; associative SIMD processor; broadcast/reduction latency; hardware multithreading; thread synchronization operations; Broadcasting; Computer aided instruction; Computer science; Concurrent computing; Delay; Hardware; Multithreading; Processor scheduling; Synchronization; Yarn
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
Conference_Location
Miami, FL
ISSN
1530-2075
Print_ISBN
978-1-4244-1693-6
Electronic_ISBN
1530-2075
Type
conf
DOI
10.1109/IPDPS.2008.4536349
Filename
4536349