DocumentCode :
2042209
Title :
A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture
Author :
Zhang, Ying Ping ; Jeon, Taikyeong ; Chen, Fei ; Wu, Haiping ; Nitzsche, Ronny ; Gao, Guang R.
Author_Institution :
Dept. of Electr. & Comput. Eng., Delaware Univ., Newark, DE, USA
fYear :
2006
fDate :
25-29 April 2006
Abstract :
The design of high-performance processor architectures is moving toward integrating a large number of processing cores on a single chip. The IBM Cyclops-64 (C64) is a petaflop supercomputer built on multi-core system-on-a-chip technology. Each C64 chip employs a multistage pipelined crossbar switch as its on-chip interconnection network to provide high-bandwidth, low-latency communication between the 160 thread processing cores, the on-chip SRAM memory banks, and other components. In this paper, we present a simulation-based study of the architecture and performance of the C64 on-chip interconnection network. Our experimental results provide the following observations on the network behavior: (1) Dedicated channels can be created between any output port and any input port of the C64 crossbar with latency as low as 7 cycles. The C64 crossbar has the potential to reach the full hardware bandwidth and exhibits non-blocking behavior; (2) The C64 crossbar is a stable network; (3) The network logic design appears to provide a reasonable opportunity for sharing the channel bandwidth between traffic in either direction; (4) A simple circular neighbor arbitration scheme can achieve a performance level competitive with that of the complex segmented LRU (least recently used) matrix arbitration scheme without losing fairness; (5) Application-driven benchmarks provide results comparable to those of synthetic workloads.
Keywords :
mainframes; multiprocessor interconnection networks; parallel machines; system-on-chip; IBM Cyclops64 multicore architecture; circular neighbor arbitration scheme; high-performance processor architecture; multicore system-on-a-chip technology; network logic design; on-chip interconnection network; petaflop supercomputer; Bandwidth; Communication switching; Delay; Multiprocessor interconnection networks; Network-on-a-chip; Process design; Supercomputers; Switches; System-on-a-chip; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
Print_ISBN :
1-4244-0054-6
Type :
conf
DOI :
10.1109/IPDPS.2006.1639301
Filename :
1639301