• DocumentCode
    1829419
  • Title

    Vector Register Design with Register Bypassing for Embedded DSP Core

  • Author

    Ming-Yen Homh ; Wu, Jen-Ming

  • Author_Institution
    Dept. of Electr. Eng., Nat. Tsing Hua Univ., Hsinchu, Taiwan
  • fYear
    2012
  • fDate
    25-27 June 2012
  • Firstpage
    1033
  • Lastpage
    1038
  • Abstract
    In this paper, we address register file design with Single Instruction Multiple Data (SIMD) for multimedia processing applications. In a 32-bit processor with 8-bit data units, one SIMD instruction can operate on four units at a time and thus reach a data parallelism of four. The data units are regarded as subwords in SIMD processing. However, the performance of SIMD is often restricted by poor subword permutation in the register file. Therefore, we present a register file architecture called Vector Register File (VRF) to reduce subword permutation latency. Consequently, heavy data traffic between memory and the register file can be avoided. A proprietary DSP core (codename Starfish) with a simulation tool chain has been developed. The simulation and debugging flow used to evaluate performance on the proprietary DSP core is presented. Several test benches, such as matrix transposition, deblocking filter, and discrete cosine transform (DCT) based on H.264/AVC, are applied for performance evaluation. A pipeline data hazard detection scheme with register bypassing is explored for the VRF to further improve pipeline efficiency. The simulation results show that, on average, we can improve cycle count by 29.87% and code size by 29.223%.
  • Keywords
    digital signal processing chips; discrete cosine transforms; embedded systems; matrix algebra; multimedia systems; multiprocessing systems; parallel memories; performance evaluation; pipeline processing; 32-bit processor; DCT; H.264/AVC; SIMD processing; Starfish simulator; VRF; data parallelism; deblocking filter; debugging flow; digital signal processor; discrete cosine transform; embedded DSP core; matrix transposition; memory file; multimedia processing applications; performance evaluation; pipeline data hazard detection; register bypassing; register file architecture design; simulation tool chain; single instruction multiple data; subword permutation; vector register design; Digital signal processing; Discrete cosine transforms; Hazards; Pipelines; Registers; Simulation; Vectors; Single Instruction Multiple Data; digital signal processor; multimedia computing; register file architecture;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE 14th International Conference on
  • Conference_Location
    Liverpool
  • Print_ISBN
    978-1-4673-2164-8
  • Type

    conf

  • DOI
    10.1109/HPCC.2012.151
  • Filename
    6332287
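
  • Example
    The subword parallelism mentioned in the abstract (four 8-bit data units packed in one 32-bit word) can be illustrated with a minimal C sketch. This is only an assumption-laden illustration of the general idea in portable software: the helper name add4x8 is hypothetical, and it is not an instruction or API of the Starfish core or the VRF described in the paper.

```c
#include <stdint.h>

/* Hypothetical helper: add four packed 8-bit subwords lane by lane.
   Each lane wraps around modulo 256 and never carries into its neighbour.
   Not taken from the paper; a software emulation of one 4-way SIMD add. */
static uint32_t add4x8(uint32_t a, uint32_t b)
{
    /* Add only the low 7 bits of every lane, so no carry can cross
       a lane boundary. */
    uint32_t low = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);

    /* Fold the top bit of each lane back in with XOR (a carry-less add
       of the two top bits and the carry already sitting in bit 7). */
    return low ^ ((a ^ b) & 0x80808080u);
}
```

    For instance, add4x8(0x01FF7F10u, 0x01010120u) gives 0x02008030: the lanes 0x01, 0xFF, 0x7F, 0x10 are each incremented by 0x01, 0x01, 0x01, 0x20, and the 0xFF lane wraps to 0x00 without disturbing the lane above it.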