  • DocumentCode
    579755
  • Title
    Efficient Sorting on the Tilera Manycore Architecture
  • Author
    Morari, Alessandro; Tumeo, Antonino; Villa, Oreste; Secchi, Simone; Valero, Mateo
  • Author_Institution
    Pacific Northwest National Laboratory, Richland, WA, USA
  • fYear
    2012
  • fDate
    24-26 Oct. 2012
  • Firstpage
    171
  • Lastpage
    178
  • Abstract
    We present an efficient implementation of the radix sort algorithm for the Tilera TILEPro64 processor. The TILEPro64 is one of the first successful commercial manycore processors. It is composed of 64 tiles interconnected through multiple fast networks-on-chip and features a fully coherent, shared distributed cache. The architecture has a large degree of flexibility and allows various optimization strategies. We describe how we mapped the algorithm to this architecture. We present an in-depth analysis of the optimizations for each phase of the algorithm with respect to the processor's sustained performance. We discuss the overall throughput reached by our radix sort implementation (up to 132 MK/s) and show that it provides comparable or better performance-per-watt with respect to state-of-the-art implementations on x86 processors and graphics processing units.
  • Keywords
    graphics processing units; network-on-chip; optimisation; shared memory systems; sorting; Tilera TILEPro64 processor; Tilera manycore architecture; commercial manycore processors; graphic processing units; networks-on-chip; optimization strategies; radix sort algorithm; shared distributed cache; Bandwidth; Computer architecture; Histograms; Instruction sets; Optimization; Sorting; Tiles
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on
  • Conference_Location
    New York, NY
  • ISSN
    1550-6533
  • Print_ISBN
    978-1-4673-4790-7
  • Type
    conf
  • DOI
    10.1109/SBAC-PAD.2012.41
  • Filename
    6374786