• DocumentCode
    1996237
  • Title

    GPU Peer-to-Peer Techniques Applied to a Cluster Interconnect

  • Author

    Ammendola, Roberto ; Bernaschi, Massimo ; Biagioni, Andrea ; Bisson, Mauro ; Fatica, Massimiliano ; Frezza, Ottorino ; Lo Cicero, Francesca ; Lonardo, Alessandro ; Mastrostefano, Enrico ; Paolucci, Pier Stanislao ; Rossetti, Davide ; Simula, Francesco ; T

  • Author_Institution
    Sezione Roma Tor Vergata, Ist. Naz. di Fis. Nucleare, Rome, Italy
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    806
  • Lastpage
    815
  • Abstract
    Modern GPUs support special protocols to exchange data directly across the PCI Express bus. While these protocols could be used to reduce GPU data transmission times, chiefly by avoiding staging to host memory, they require specific hardware features which are not available on current-generation network adapters. In this paper we describe the architectural modifications required to implement peer-to-peer access to NVIDIA Fermi- and Kepler-class GPUs on an FPGA-based cluster interconnect. We also discuss the current software implementation, which integrates this feature by minimally extending the RDMA programming model, as well as some issues raised when employing it in a higher-level API such as MPI. Finally, the current limits of the technique are studied by analyzing the performance improvements on low-level benchmarks and on two GPU-accelerated applications, showing when and how they seem to benefit from the GPU peer-to-peer method.
  • Keywords
    field programmable gate arrays; graphics processing units; peer-to-peer computing; peripheral interfaces; FPGA based cluster interconnect; GPU data transmission times; GPU peer-to-peer techniques; GPUs support; NVIDIA Fermi- and Kepler class GPU; PCI express bus; RDMA programming model; architectural modifications; exchange data; hardware features; host memory; peer-to-peer access; Bandwidth; Benchmark testing; Graphics processing units; Hardware; Peer-to-peer computing; Protocols; Switches; GPU; interconnection network; parallel computing; peer-to-peer;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
  • Conference_Location
    Cambridge, MA
  • Print_ISBN
    978-0-7695-4979-8
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2013.128
  • Filename
    6650959