• DocumentCode
    117252
  • Title

    A performance model of fast 2D-DCT parallel JPEG encoding using CUDA GPU and SMP-architecture

  • Author

    Shatnawi, Mohammed K. Ali ; Shatnawi, Hussein Ali

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of Ottawa, Ottawa, ON, Canada
  • fYear
    2014
  • fDate
    9-11 Sept. 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    The performance of image compression algorithms for big data can be enhanced using parallel computation. The JPEG algorithm is a lossy compression method that uses the DCT to eliminate high-frequency components. In this paper, we describe a cross-compatible design of JPEG on SMP and GPU architectures. To achieve maximal efficiency, we exploit the substantial parallelism to design an optimized version of JPEG based on a thread model. A fair evaluation of the algorithm on 24-bit BMP images, using several performance metrics, is run on a fully optimized GPU using CUDA and on an SMP using the SESC simulator. Our cross-architectural evaluation revealed a speedup of 25.49 on SESC and 21 on the GPU, showing that the CPU outperformed the GPU for JPEG.
  • Keywords
    Big Data; data compression; discrete cosine transforms; graphics processing units; image coding; parallel architectures; 24-bit BMP; 2D-DCT parallel JPEG encoding; CUDA GPU architecture; SESC simulator; SMP-architecture; big data; lossy image compression algorithms; Computer architecture; Graphics processing units; Integrated circuits; Transform coding; Amdahl Law; Fast DCT; GPU; JPEG; Parallel JPEG; SESC; SMP
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Extreme Computing Conference (HPEC), 2014 IEEE
  • Conference_Location
    Waltham, MA
  • Print_ISBN
    978-1-4799-6232-7
  • Type

    conf

  • DOI
    10.1109/HPEC.2014.7040947
  • Filename
    7040947