DocumentCode :
2523997
Title :
The effectiveness of loop unrolling for modulo scheduling in clustered VLIW architectures
Author :
Sánchez, Jesus ; González, Antonio
Author_Institution :
Dept. d'Arquitectura de Computadors, Univ. Politecnica de Catalunya, Barcelona, Spain
fYear :
2000
fDate :
2000
Firstpage :
555
Lastpage :
562
Abstract :
Clustered organizations are becoming a common trend in the design of VLIW architectures. In this work we propose a novel modulo scheduling approach for such architectures. The proposed technique performs cluster assignment and instruction scheduling in a single pass, which is shown to be more effective than performing the assignment first and the scheduling afterwards. We also show that loop unrolling significantly enhances the performance of the proposed scheduler, especially when the communication channel among clusters is the main performance bottleneck. By selectively unrolling some loops, we can obtain the best performance with the minimum increase in code size. Performance evaluation on SPECfp95 shows that the clustered architecture achieves about the same IPC (instructions per cycle) as a unified architecture with the same resources. Moreover, when the cycle time is taken into account, a 4-cluster configuration is 3.6 times faster than the unified architecture.
Keywords :
parallel architectures; performance evaluation; processor scheduling; SPECfp95; cluster assignment; clustered VLIW architectures; clustered architecture; instruction scheduling; loop unrolling; modulo scheduling; performance evaluation; Communication channels; Computer architecture; Continuous improvement; Delay effects; Electronic mail; Pipeline processing; Processor scheduling; Proposals; Registers; VLIW;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing, 2000. Proceedings. 2000 International Conference on
Conference_Location :
Toronto, Ont.
ISSN :
0190-3918
Print_ISBN :
0-7695-0768-9
Type :
conf
DOI :
10.1109/ICPP.2000.876173
Filename :
876173