• DocumentCode
    2449392
  • Title

    OpenCL - An effective programming model for data parallel computations at the Cell Broadband Engine

  • Author

    Breitbart, Jens ; Fohry, Claudia

  • Author_Institution
    Res. Group Programming Languages / Methodologies, Univ. Kassel, Kassel, Germany
  • fYear
    2010
  • fDate
    19-23 April 2010
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Current processor architectures are diverse and heterogeneous. Examples include multicore chips, GPUs and the Cell Broadband Engine (CBE). The recent Open Compute Language (OpenCL) standard aims at efficiency and portability. This paper explores its efficiency when implemented on the CBE, without using CBE-specific features such as explicit asynchronous memory transfers. We based our experiments on two applications: matrix multiplication, and the client side of the Einstein@Home distributed computing project. Both were programmed in OpenCL, and then translated to the CBE. For matrix multiplication, we deployed different levels of OpenCL performance optimization, and observed that they pay off on the CBE. For the Einstein@Home application, our translated OpenCL version achieves almost the same speed as a native CBE version. Another main contribution of the paper is a proposal for an additional memory level in OpenCL, called static local memory. With little programming expense, it can lead to significant speedups, such as a factor of seven for reduction. Finally, we studied two versions of the OpenCL to CBE mapping, in which the PPE component of the CBE does or does not take the role of a compute unit.
  • Keywords
    computer graphics; coprocessors; matrix multiplication; parallel programming; Einstein@Home distributed computing project; GPU; cell broadband engine; data parallel computations; explicit asynchronous memory transfers; matrix multiplication; multicore chips; open compute language; programming model; static local memory; Computer architecture; Computer languages; Concurrent computing; Distributed computing; Engines; Hardware; Multicore processing; Optimization; Parallel programming; Proposals;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on
  • Conference_Location
    Atlanta, GA
  • Print_ISBN
    978-1-4244-6533-0
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2010.5470823
  • Filename
    5470823