• DocumentCode
    2010129
  • Title
    Analyzing the Efficiency and Bottleneck of Scientific Programs on Imagine Stream Processor by Simulation
  • Author
    Che, Yonggang ; Xu, Chuanfu ; Wang, Zhenghua
  • Author_Institution
    Sch. of Comput., Nat. Univ. of Defense Technol., Changsha
  • fYear
    2008
  • fDate
    10-12 Dec. 2008
  • Firstpage
    89
  • Lastpage
    98
  • Abstract
    The Imagine stream processor has shown high performance and efficiency for media applications. Its potential for scientific applications is of great interest to the high-performance computing community. This paper investigates the subject from a new angle. It roughly classifies scientific programs into three classes based on their computation-to-memory-access ratios. For each class, typical programs are written in the StreamC/KernelC stream languages and run on the cycle-accurate simulator of Imagine. The performance data are analyzed in depth, with special attention to the performance bottlenecks, and are compared against data from two general-purpose x86 processors. The results show that programs with no DRAM accesses attain high floating-point performance and efficiency on Imagine. The performance of these programs is restricted only by limited ILP (Instruction-Level Parallelism) and load imbalance across ALUs. Programs with computation-to-memory-operation ratios of O(n) attain absolute floating-point performance on Imagine comparable to that obtained on general-purpose processors, but their floating-point efficiencies are not satisfactory. It is essential to optimize these programs for high SRF (Stream Register File) and LRF (Local Register File) reuse and high ILP on Imagine. Programs with lower computation-to-memory-operation ratios attain much lower floating-point performance and efficiency on Imagine than on x86 processors.
  • Keywords
    microcomputers; natural sciences computing; parallel architectures; performance evaluation; KernelC stream language; StreamC stream language; cycle accurate simulator; floating-point efficiencies; general-purpose x86 processors; high performance computing community; imagine stream processor; instruction-level parallelism; local register file reuse; media application; scientific application; scientific programs; stream register file; Analytical models; Application software; Arithmetic; Bandwidth; Computational modeling; Computer architecture; High performance computing; Image analysis; Scientific computing; Streaming media; Imagine stream processor; floating-point efficiency; performance bottleneck; performance evaluation; scientific applications;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Parallel and Distributed Processing with Applications, 2008. ISPA '08. International Symposium on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3471-8
  • Type
    conf
  • DOI
    10.1109/ISPA.2008.17
  • Filename
    4725139