Title :
Analyzing the Efficiency and Bottleneck of Scientific Programs on Imagine Stream Processor by Simulation
Author :
Che, Yonggang ; Xu, Chuanfu ; Wang, Zhenghua
Author_Institution :
Sch. of Comput., Nat. Univ. of Defense Technol., Changsha
Abstract :
The Imagine stream processor has shown high performance and efficiency for media applications. Its potential for scientific applications is of great interest to the high-performance computing community. This paper investigates this subject from a new angle. It roughly classifies scientific programs into three classes based on their computation-to-memory-access ratios. For each class, typical programs are written in the StreamC/KernelC stream languages and run on the cycle-accurate Imagine simulator. The performance data are analyzed in depth, with special attention to performance bottlenecks. The performance data obtained on Imagine are compared against data from two general-purpose x86 processors. The results show that programs with no DRAM accesses attain high floating-point performance and efficiency on Imagine. The performance of these programs is restricted only by limited ILP (instruction-level parallelism) and load imbalance across ALUs. Programs with computation-to-memory-operation ratios of O(n) attain absolute floating-point performance on Imagine comparable to that obtained on general-purpose processors, but their floating-point efficiencies are not satisfactory. It is essential to optimize these programs for high SRF (Stream Register File) and LRF (Local Register File) reuse and high ILP on Imagine. Programs with lower computation-to-memory-operation ratios attain much lower floating-point performance and efficiency on Imagine than on x86 processors.
Keywords :
microcomputers; natural sciences computing; parallel architectures; performance evaluation; KernelC stream language; StreamC stream language; cycle accurate simulator; floating-point efficiencies; general-purpose x86 processors; high performance computing community; Imagine stream processor; instruction-level parallelism; local register file reuse; media application; scientific application; scientific programs; stream register file; Analytical models; Application software; Arithmetic; Bandwidth; Computational modeling; Computer architecture; High performance computing; Image analysis; Scientific computing; Streaming media; floating-point efficiency; performance bottleneck; scientific applications
Conference_Title :
Parallel and Distributed Processing with Applications, 2008. ISPA '08. International Symposium on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3471-8
DOI :
10.1109/ISPA.2008.17