• DocumentCode
    2423263
  • Title

    Level-3 BLAS on the TI C6678 Multi-core DSP

  • Author

    Ali, Murtaza ; Stotzer, Eric ; Igual, Francisco D. ; Van de Geijn, Robert A.

  • fYear
    2012
  • fDate
    24-26 Oct. 2012
  • Firstpage
    179
  • Lastpage
    186
  • Abstract
    Digital Signal Processors (DSPs) are commonly employed in embedded systems. The increase of processing needs in cellular base-stations, radio controllers and industrial/medical imaging systems has led to the development of multi-core DSPs as well as inclusion of floating point operations while maintaining low power dissipation. The eight-core DSP from Texas Instruments, codenamed TMS320C6678, provides a peak performance of 128 GFLOPS (single precision) and an effective 32 GFLOPS (double precision) for only 10 watts. In this paper, we present the first complete implementation and report performance of the Level-3 Basic Linear Algebra Subprograms (BLAS) routines for this DSP. These routines are first optimized for single core and then parallelized over the different cores using OpenMP constructs. The results show that we can achieve about 8 single precision GFLOPS/watt and 2.2 double precision GFLOPS/watt for General Matrix-Matrix multiplication (GEMM). The performance of the rest of the Level-3 BLAS routines is within 90% of the corresponding GEMM routines.
  • Keywords
    digital signal processing chips; embedded systems; linear algebra; matrix multiplication; message passing; power aware computing; OpenMP construct; TI C6678 multicore DSP; TMS320C6678; Texas Instruments; cellular base-station; digital signal processor; embedded system; floating point operation; general matrix-matrix multiplication; industrial imaging system; level-3 basic linear algebra subprograms routine; low power dissipation; medical imaging system; power 10 W; radio controller; Computer architecture; Digital signal processing; Kernel; Libraries; Linear algebra; Random access memory; System-on-a-chip; BLAS; DSPs; linear algebra
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on
  • Conference_Location
    New York, NY
  • ISSN
    1550-6533
  • Print_ISBN
    978-1-4673-4790-7
  • Type

    conf

  • DOI
    10.1109/SBAC-PAD.2012.26
  • Filename
    6374787