• DocumentCode
    2010348
  • Title

    DMA Performance Analysis and Multi-core Memory Optimization for SWIM Benchmark on the Cell Processor

  • Author

    Dou, Yong ; Deng, Lin ; Xu, Jinhui ; Zheng, Yi

  • Author_Institution
    Nat. Lab. for Parallel & Distrib. Process., NUDT, Changsha, China
  • fYear
    2008
  • fDate
    10-12 Dec. 2008
  • Firstpage
    170
  • Lastpage
    179
  • Abstract
    The Cell processor is a typical heterogeneous multi-core processor with powerful computing capability. However, developing parallel applications for it runs into the 'memory wall': limited local memory capacity, limited memory bandwidth shared among multiple cores, and long data communication latency. The DMA transfer mechanism is often used to hide this latency and make more effective use of memory bandwidth. In this paper, we start with a series of DMA experiments on the Cell processor architecture and use exponential fitting to derive a unified formula for average DMA bandwidth, which quantifies how the two main factors, the number of SPEs and the DMA block size, affect that bandwidth. Supported by this DMA performance formula, we apply four types of memory optimization while parallelizing the SWIM benchmark program into a multi-core version. We use a Sony PlayStation 3 (PS3) as our test bed. For the SWIM benchmark with 6 SPE cores, we obtain a speedup of over 13 times compared to a single PPE, and of 3.3 to 6.18 times compared to AMD and Intel CPUs.
  • Keywords
    benchmark testing; computer architecture; data communication; mathematical analysis; microprocessor chips; optimisation; Cell processor architecture; DMA performance analysis; SWIM benchmark; SWIM benchmark program; data communication; heterogeneous multicore processor; mathematical analysis; memory bandwidth; multicore memory optimization; Bandwidth; Benchmark testing; Context; Data communication; Delay; Fitting; Mathematical analysis; Multicore processing; Performance analysis; Performance evaluation; DMA performance; SPEC parallelization; memory optimization; multi-core processor;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing with Applications, 2008. ISPA '08. International Symposium on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3471-8
  • Type

    conf

  • DOI
    10.1109/ISPA.2008.54
  • Filename
    4725147