• DocumentCode
    2799558
  • Title

    A Resource Optimized Remote-Memory-Access Architecture for Low-latency Communication

  • Author

    Nüssle, Mondrian ; Scherer, Martin ; Brüning, Ulrich

  • Author_Institution
    Comput. Archit. Group, Univ. of Heidelberg, Heidelberg, Germany
  • fYear
    2009
  • fDate
    22-25 Sept. 2009
  • Firstpage
    220
  • Lastpage
    227
  • Abstract
    This paper introduces a new, highly optimized architecture for remote memory access (RMA). RMA, using put and get operations, is a one-sided communication function that is important, among other uses, in current and upcoming Partitioned Global Address Space (PGAS) systems. In this work, a virtualized hardware unit is described that is resource optimized and exhibits high overlap, processor offload, and very good latency characteristics. A single HyperTransport packet caused by one CPU instruction is sufficient to start an RMA operation, thus reducing latency to an absolute minimum. In addition to the basic architecture, an implementation in FPGA technology is presented together with an evaluation of the target ASIC implementation. The current system can sustain more than 4.9 million transactions per second on the FPGA and exhibits an end-to-end latency of 1.2 µs for an 8-byte put operation. Both values are limited by the FPGA technology used for the prototype implementation. An estimate of the performance reachable on ASIC technology suggests that application-to-application latencies of less than 500 ns are feasible.
  • Keywords
    application specific integrated circuits; field programmable gate arrays; parallel memories; ASIC; FPGA; HyperTransport packet; Partitioned Global Address Space systems; low-latency communication; resource optimized remote-memory-access architecture; time 1.2 µs; virtualized hardware unit; Application specific integrated circuits; Computer architecture; Delay; Electronics packaging; Engines; Field programmable gate arrays; Hardware; Memory architecture; Parallel processing; Protocols; device virtualization; high-performance computing; interconnection networks; remote memory access;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 2009. ICPP '09. International Conference on
  • Conference_Location
    Vienna
  • ISSN
    0190-3918
  • Print_ISBN
    978-1-4244-4961-3
  • Electronic_ISBN
    0190-3918
  • Type

    conf

  • DOI
    10.1109/ICPP.2009.62
  • Filename
    5362316