• DocumentCode
    3704251
  • Title
    Energy-Efficient Sorting with the Distributed Memory Architecture ePUMA
  • Author

    Andréas ;Joar Sohl;Dake Liu

  • Author_Institution
    Dept. of Electr. Eng., Linkoping Univ., Linkoping, Sweden
  • Volume
    3
  • fYear
    2015
  • Firstpage
    116
  • Lastpage
    123
  • Abstract
    This paper presents the novel heterogeneous DSP architecture ePUMA and demonstrates its features through an implementation of sorting of larger data sets. We derive a sorting algorithm with fixed-size merging tasks suitable for distributed memory architectures, which allows very simple scheduling and predictable data-independent sorting time. The implementation on ePUMA utilizes the architecture's specialized compute cores and control cores, and local memory parallelism, to separate and overlap sorting with data access and control for close to stall-free sorting. Penalty-free unaligned and out-of-order local memory access is used in combination with proposed application-specific sorting instructions to derive highly efficient local sorting and merging kernels used by the system-level algorithm. Our evaluation shows that the proposed implementation can rival the sorting performance of high-performance commercial CPUs and GPUs, with two orders of magnitude higher energy efficiency, which would allow high-performance sorting on low-power devices.
  • Keywords
    "Sorting","Gold","Digital signal processing","Registers","Memory management","Hardware"
  • Publisher
    ieee
  • Conference_Titel
    Trustcom/BigDataSE/ISPA, 2015 IEEE
  • Type
    conf
  • DOI
    10.1109/Trustcom.2015.620
  • Filename
    7345636